DANTE: Data-Driven Test Case Selection and Prioritization for Long-Running Test Suites

Simone Reale; Elisabetta Di Nitto; Luciano Baresi; Massimiliano Di Penta; Giovanni Quattrocchi
2026-01-01

Abstract

As software systems evolve, large test suites increasingly delay developer feedback in Continuous Integration and Delivery (CI/CD). Test Case Selection and Prioritization (TCSP) mitigates this issue, and recent evidence shows that simple heuristics, such as prioritizing recently failed or fast-running tests, often outperform sophisticated machine learning (ML) approaches, which incur high training costs and suffer from distribution shift. However, these simple heuristics remain rigid and fail to leverage the rich historical data continuously generated by CI/CD pipelines. This paper introduces DANTE, a training-free framework that evolves simple TCSP heuristics into a data-driven selection and prioritization strategy. DANTE processes historical signals through a probabilistic stickiness metric that automatically adapts to project-specific failure and fixing patterns, eliminating the need for manual tuning. This history-aware logic is complemented by a lightweight, commit-aware component that captures the relevance between code changes and test artifacts. We evaluated DANTE on the Long-Running Test Suite (LRTS) dataset, focusing on its Java projects, which account for 9 of the 10 projects in LRTS and comprise more than 21,000 CI builds with multi-hour test suites. The results show that DANTE consistently outperforms state-of-the-art cost-history cognizant heuristics and ML baselines in both test selection and prioritization, while remaining robust to flaky tests.
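To make the baseline concrete, the following is a minimal sketch of the kind of simple heuristic the abstract mentions (recently failed tests first, then faster tests first). It is not DANTE and does not implement the stickiness metric or the commit-aware component, which the abstract does not specify. The names `HistoryRecord` and `prioritize`, and the `recent_window` threshold, are assumptions introduced purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class HistoryRecord:
    """Hypothetical per-test summary extracted from CI history."""
    name: str
    # Number of builds since the test last failed; None if it never failed.
    builds_since_last_failure: Optional[int]
    mean_duration_s: float


def prioritize(records: List[HistoryRecord], recent_window: int = 5) -> List[str]:
    """Order tests: recently failed first, then by ascending mean duration.

    `recent_window` is an illustrative, manually chosen threshold; it is
    exactly the kind of fixed parameter a rigid heuristic depends on.
    """
    def sort_key(r: HistoryRecord):
        recently_failed = (
            r.builds_since_last_failure is not None
            and r.builds_since_last_failure <= recent_window
        )
        # False sorts before True, so negate to put recent failures first.
        return (not recently_failed, r.mean_duration_s, r.name)

    return [r.name for r in sorted(records, key=sort_key)]


if __name__ == "__main__":
    history = [
        HistoryRecord("slow_stable", None, 900.0),
        HistoryRecord("fast_stable", None, 2.0),
        HistoryRecord("slow_recent_fail", 1, 600.0),
        HistoryRecord("fast_old_fail", 40, 1.0),
    ]
    print(prioritize(history))
```

The fixed `recent_window` in this sketch illustrates the rigidity the abstract refers to: the threshold has to be picked by hand and does not adapt to a project's own failure and fixing patterns.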
IEEE International Conference on Software Testing, Verification and Validation (ICST) 2026
Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1309638