Enhancing Functional Interpretability in Gene Expression Analysis Through Biologically-Guided Feature Selection
Mongardi, Sofia; Cascianelli, Silvia; Masseroli, Marco
2025-01-01
Abstract
Reducing the high dimensionality of the original feature space with feature selection algorithms is crucial in gene-expression-based predictive tasks, both to potentially improve performance and to clarify each feature’s predictive power and biological meaning. Embedded feature selection techniques such as LASSO select small subsets of relevant features based solely on their quantitative contribution and predictive power, which often leads to the selection of features with limited biological relevance. This work first provides a broad exploratory analysis of LASSO feature selection, assessing how different hyper-parameters affect which features are selected as most relevant and their biological significance. It then introduces a new approach that guides LASSO to select features by considering their biological relevance as well as their predictive power. To this end, the work proposes a novel Gene Information Score that estimates each gene’s biological relevance, and shows its use in enhancing feature selection.

| File | Access | Version | Size | Format |
|---|---|---|---|---|
| C39_CIBB_2023_LNBI_2025_293-307.pdf | Restricted access | Publisher’s version | 486.49 kB | Adobe PDF |
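The two ideas in the abstract, that LASSO's selected subset depends on its hyper-parameters and that the selection can be steered by per-gene biological relevance, can be illustrated with a minimal sketch. The abstract does not say how the Gene Information Score is computed or how it enters the model, so everything below is an assumption made for illustration: the data are synthetic, the loss is plain squared error with no intercept, the `relevance` values are hypothetical placeholders rather than real Gene Information Scores, and mapping relevance to a per-feature penalty weight (a generic weighted L1 penalty) is one plausible mechanism, not necessarily the authors' method.

```python
import random

def soft_threshold(z, t):
    """Shrink z toward zero by t (the proximal step for an L1 penalty)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def weighted_lasso(X, y, lam, weights=None, n_iter=200):
    """Coordinate descent for (1/2n)*||y - X b||^2 + lam * sum_j w_j * |b_j|."""
    n, p = len(X), len(X[0])
    w = weights if weights is not None else [1.0] * p
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, kept up to date after each update
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * w[j]) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

def selected(beta):
    """Indices of features with a non-zero coefficient."""
    return [j for j, b in enumerate(beta) if b != 0.0]

# Synthetic stand-in for an expression matrix: 60 samples x 8 "genes".
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(60)]
y = [3.0 * row[0] - 2.0 * row[1] + rng.gauss(0, 0.1) for row in X]

# 1) Effect of the regularization strength on the selected subset.
for lam in (0.01, 0.1, 1.0, 5.0):
    print(lam, selected(weighted_lasso(X, y, lam)))

# 2) Hypothetical relevance scores in (0, 1]; lower relevance -> larger penalty.
relevance = [1.0, 0.05, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
penalty = [1.0 / r for r in relevance]
print(selected(weighted_lasso(X, y, 0.5, weights=penalty)))
```

With uniform weights this reduces to ordinary LASSO, so the first loop shows only the hyper-parameter effect. In the second call, a gene with low assumed relevance needs a stronger association with the outcome to survive the penalty, which is the sense in which a relevance score could guide the selection.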


