Extended Spearman and Kendall coefficients for gene annotation list correlation

Chicco, Davide; Ciceri, Eleonora; Masseroli, Marco

Gene annotations are a key concept in bioinformatics, and computational methods able to predict them are a fundamental contribution to the field. Several machine learning algorithms are available in this domain; they include relevant parameters that might influence the output list of predicted gene annotations. The extent to which variation of these key parameters affects the output gene annotation lists remains an open question. Here, we provide support for such an evaluation by introducing two list correlation measures, which are based on and extend the Spearman ρ correlation coefficient and the Kendall τ distance, respectively. Applying these measures to gene annotation lists predicted from Gene Ontology annotation datasets of different organisms’ genes revealed interesting patterns between the predicted lists. Additionally, the measures allowed us to draw some useful considerations about the prediction parameters and algorithms used.
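
As background, the two classical measures that the proposed coefficients build on can be sketched as follows. This is only a minimal illustration of the standard Spearman ρ and Kendall τ distance, assuming two lists that rank the same items with no ties; it does not reproduce the extended measures introduced in this work. The function names and the annotation identifiers (ann1, ann2, ...) are hypothetical placeholders.

```python
from itertools import combinations


def to_ranks(ordered_list):
    """Map each item to its 1-based position in the list."""
    return {item: pos for pos, item in enumerate(ordered_list, start=1)}


def spearman_rho(list_a, list_b):
    """Classical Spearman rho for two rankings of the same items (no ties)."""
    rank_a, rank_b = to_ranks(list_a), to_ranks(list_b)
    n = len(rank_a)
    # Sum of squared rank differences over all items.
    d2 = sum((rank_a[item] - rank_b[item]) ** 2 for item in rank_a)
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def kendall_tau_distance(list_a, list_b):
    """Number of item pairs that the two rankings order differently."""
    rank_a, rank_b = to_ranks(list_a), to_ranks(list_b)
    return sum(
        1
        for x, y in combinations(rank_a, 2)
        if (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y]) < 0
    )


# Hypothetical predicted annotation lists: only the first two items are swapped.
list_a = ["ann1", "ann2", "ann3", "ann4"]
list_b = ["ann2", "ann1", "ann3", "ann4"]

print(spearman_rho(list_a, list_b))          # 0.8
print(kendall_tau_distance(list_a, list_b))  # 1
```

Both classical measures assume that the two lists contain exactly the same elements, which is generally not the case for annotation lists predicted under different parameter settings.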