Bayesian Meta-Analysis of Software Defect Prediction With Machine Learning
Damian Andrew Tamburri
2023-01-01
Abstract
Machine learning is widely used to predict defect-prone software components, facilitating testing and improving application quality. In a recent meta-analysis on binary classification for software defect prediction, so-called researcher bias, i.e., the effect of the group that conducts the study, was shown to play a critical role; that analysis, however, relied on null hypothesis significance testing alone. Since null hypothesis testing is based on the p-value, which is not the desired probability that the null hypothesis is true, it suffers from several important drawbacks. This article presents a Bayesian analysis of the same dataset, which overcomes the pitfalls of the null hypothesis testing approach and relaxes the assumptions of the methods used in the previous study. While the Bayesian analysis identifies the choice of software metrics as the most influential factor for a classifier's performance, researcher bias remains the second most important factor: precautions against researcher bias are therefore still critical in software defect prediction studies. To confirm this finding, we further analyze the data with more advanced Bayesian modeling, which identifies (1) the classifiers with better performance, (2) the datasets whose instances are harder to predict, and (3) the metrics that impact a classifier's performance.
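
To make the kind of analysis described above concrete, the following is a minimal sketch, written with PyMC, of a hierarchical Bayesian regression of classifier performance on study-level factors. It is illustrative only and not the model used in the article; the file name, column names, and the choice of MCC as the performance measure are assumptions.

# Minimal sketch (illustrative, not the article's actual model): a hierarchical
# Bayesian regression of classifier performance on study-level factors.
# The CSV file, column names, and the MCC measure are hypothetical assumptions.
import pandas as pd
import pymc as pm

df = pd.read_csv("defect_prediction_results.csv")  # hypothetical per-study results
metric_idx, metric_levels = pd.factorize(df["metric_family"])  # e.g., code vs. process metrics
group_idx, group_levels = pd.factorize(df["research_group"])   # group that ran the study

with pm.Model() as model:
    # Partial pooling: each factor contributes varying intercepts whose spread
    # (sigma_metric, sigma_group) reflects how much variance that factor explains.
    sigma_metric = pm.HalfNormal("sigma_metric", sigma=1.0)
    sigma_group = pm.HalfNormal("sigma_group", sigma=1.0)
    a_metric = pm.Normal("a_metric", mu=0.0, sigma=sigma_metric, shape=len(metric_levels))
    a_group = pm.Normal("a_group", mu=0.0, sigma=sigma_group, shape=len(group_levels))
    intercept = pm.Normal("intercept", mu=0.0, sigma=1.0)
    noise = pm.HalfNormal("noise", sigma=1.0)
    mu = intercept + a_metric[metric_idx] + a_group[group_idx]
    pm.Normal("performance", mu=mu, sigma=noise, observed=df["mcc"].values)
    idata = pm.sample(draws=2000, tune=1000, random_seed=42)

# Comparing the posteriors of sigma_metric and sigma_group indicates which factor
# (software metrics vs. researcher group) accounts for more of the performance variance.

In a sketch like this, factor importance is read off the group-level standard deviations rather than from p-values, which is one way a Bayesian variance-components model can rank metrics against researcher bias while avoiding the assumptions of null hypothesis testing.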


