Hybrid evolutionary framework for selection of genes predicting breast cancer relapse

L. Perino; S. Cascianelli; M. Masseroli
2020-01-01

Abstract

Predicting relapse events is still one of the major challenges in breast cancer research. Although gene expression-based classifiers can tackle this task, working with thousands of genes and only a few samples jeopardizes the performance of a classifier trained without proper gene selection. We propose a novel hybrid evolutionary gene selection framework that uses a Multi-Objective Genetic Algorithm (MOGA) to search a wider range of gene selections and handles the MOGA results in a new way, overcoming the limited interpretability of the broad set of solutions a MOGA returns. The framework provides a classifier with a gene signature that not only yields the best cross-validation result but also performs notably and robustly when tested on unseen samples of a hold-out set. This robustness on the hold-out set shows the strength of our key innovation: the final module of the framework, which fully exploits the high variability of the MOGA outputs rather than choosing just one of the solutions, as is commonly done in the literature. It combines all MOGA results into more robust and compact gene occurrence-based signatures, under the reasonable assumption that highly recurrent genes have a more crucial biological role, are more suitable for clinical application, and discriminate well between relapsed and relapse-free patients, as confirmed by the classification results obtained.
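
The occurrence-based combination described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the final module ranks or thresholds genes, so everything below is an assumption made for illustration: the function name `occurrence_signature`, the `min_fraction` cut-off rule, and the toy gene sets are hypothetical and are not taken from the paper.

```python
from collections import Counter


def occurrence_signature(solutions, min_fraction=0.5):
    """Build a gene signature from the genes that recur across solutions.

    solutions: iterable of gene collections, one per MOGA solution
               (hypothetical input format).
    min_fraction: assumed cut-off; a gene is kept if it appears in at
                  least this fraction of the solutions.
    Returns the kept genes, most recurrent first (ties broken by name).
    """
    solutions = [set(sol) for sol in solutions]  # count each gene once per solution
    if not solutions:
        return []
    counts = Counter(gene for sol in solutions for gene in sol)
    threshold = min_fraction * len(solutions)
    recurrent = [gene for gene, count in counts.items() if count >= threshold]
    return sorted(recurrent, key=lambda gene: (-counts[gene], gene))


# Toy example: four hypothetical MOGA solutions (gene names are placeholders).
solutions = [
    {"ESR1", "MKI67", "BRCA1"},
    {"ESR1", "MKI67", "TP53"},
    {"ESR1", "ERBB2"},
    {"ESR1", "MKI67", "ERBB2"},
]
signature = occurrence_signature(solutions, min_fraction=0.75)
print(signature)
```

With a cut-off of 0.75, only genes present in at least three of the four toy solutions are kept; lowering `min_fraction` gives a larger, less selective signature.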
2020
Proceedings of 2020 IEEE International Joint Conference on Neural Networks (IJCNN)
978-1-7281-6926-2
File: PID6430905.pdf (open access; pre-print, pre-refereeing version; Adobe PDF, 852.95 kB)

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1163001