DANTE: Data-Driven Test Case Selection and Prioritization for Long-Running Test Suites
Simone Reale;Elisabetta Di Nitto;Luciano Baresi;Massimiliano Di Penta;Giovanni Quattrocchi
2026-01-01
Abstract
As software systems evolve, large test suites increasingly delay developer feedback in Continuous Integration and Delivery (CI/CD). Test Case Selection and Prioritization (TCSP) mitigates this issue, and recent evidence shows that simple heuristics, such as prioritizing recently failed or fast-running tests, often outperform sophisticated machine learning (ML) approaches, which incur high training costs and suffer from distribution shift. However, these simple heuristics remain rigid and fail to leverage the rich historical data continuously generated by CI/CD pipelines. This paper introduces DANTE, a data-driven yet training-free framework that evolves simple TCSP heuristics into a data-driven selection and prioritization strategy. DANTE processes historical signals through a probabilistic stickiness metric that automatically adapts to project-specific failure and fixing patterns, eliminating the need for manual tuning. This history-aware logic is complemented by a lightweight, commit-aware component that captures the relevance between code changes and test artifacts. We evaluated DANTE on the Long-Running Test Suite (LRTS) dataset, focusing on the Java projects, which constitute the vast majority in LRTS (9 out of 10 in total) and comprise more than 21,000 CI builds with multi-hour test suites. The results show that DANTE consistently outperforms state-of-the-art cost-history cognizant heuristics and ML baselines in both test selection and prioritization, while remaining robust to flaky tests.

| File | Access | Version | Size | Format |
|---|---|---|---|---|
| ex_ICSE_2026___SOLO_CLASSICO (12).pdf | Open access | Pre-print (pre-refereeing) | 881.33 kB | Adobe PDF |


