Forward and Backward Feature Selection Guided by Prior Biological Knowledge for Enhanced Interpretability
Mongardi, Sofia;Cascianelli, Silvia;Masseroli, Marco
2025-01-01
Abstract
In gene expression analysis, the high dimensionality and limited sample size often lead to instability and overfitting of predictive models. While feature selection algorithms are commonly used to identify the most predictive genes, traditional approaches tend to focus solely on quantitative contributions, which can limit the discovery of deeper biological insights. To address this, we propose a novel wrapper-based approach that integrates prior biological knowledge into the feature selection process. Our approach extends standard forward feature selection by iteratively adding the most promising gene while ensuring it provides biological value, computed from prior knowledge derived from publicly available data sources. Additionally, we apply the same concept to backward selection, iteratively removing features that contribute the least to the predictive performance while providing limited additional biological information.
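The abstract does not say how predictive performance and biological value are combined, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a weighted sum controlled by a hypothetical parameter `alpha`, biological scores already scaled to [0, 1], and a user-supplied `evaluate` callable standing in for a cross-validated model score. The function names, gene scores, and toy evaluator are all illustrative assumptions.

```python
from typing import Callable, Dict, List, Sequence

Evaluator = Callable[[List[str]], float]


def guided_forward_selection(genes: Sequence[str], evaluate: Evaluator,
                             bio_score: Dict[str, float], n_select: int,
                             alpha: float = 0.5) -> List[str]:
    """Greedily add the gene with the best blend of performance and prior knowledge.

    Assumption: the blend is (1 - alpha) * performance + alpha * biological score.
    """
    selected: List[str] = []
    remaining = list(genes)
    while remaining and len(selected) < n_select:
        def combined(gene: str) -> float:
            performance = evaluate(selected + [gene])
            return (1 - alpha) * performance + alpha * bio_score.get(gene, 0.0)

        best = max(remaining, key=combined)
        selected.append(best)
        remaining.remove(best)
    return selected


def guided_backward_elimination(genes: Sequence[str], evaluate: Evaluator,
                                bio_score: Dict[str, float], n_keep: int,
                                alpha: float = 0.5) -> List[str]:
    """Greedily drop the gene that costs little performance and adds little biology."""
    current = list(genes)
    while len(current) > n_keep:
        def removal_score(gene: str) -> float:
            rest = [g for g in current if g != gene]
            # High when the model still performs well without the gene
            # and the gene carries a low biological score.
            return (1 - alpha) * evaluate(rest) + alpha * (1 - bio_score.get(gene, 0.0))

        worst = max(current, key=removal_score)
        current.remove(worst)
    return current


# Toy data (hypothetical values, for illustration only).
toy_perf = {"TP53": 0.30, "BRCA1": 0.28, "GAPDH": 0.29, "ACTB": 0.05}
toy_bio = {"TP53": 0.9, "BRCA1": 0.8, "GAPDH": 0.1, "ACTB": 0.1}
toy_genes = list(toy_perf)


def toy_evaluate(subset: List[str]) -> float:
    """Stand-in for a cross-validated score: additive per-gene contributions."""
    return sum(toy_perf[g] for g in subset)


if __name__ == "__main__":
    print(guided_forward_selection(toy_genes, toy_evaluate, toy_bio, 2, alpha=0.5))
    print(guided_forward_selection(toy_genes, toy_evaluate, toy_bio, 2, alpha=0.0))
    print(guided_backward_elimination(toy_genes, toy_evaluate, toy_bio, 2, alpha=0.5))
```

With these toy values, setting `alpha=0.0` reduces the procedure to plain forward selection, which picks GAPDH second because of its marginally higher score. With `alpha=0.5`, BRCA1 is picked second instead because of its higher prior-knowledge score. In practice `evaluate` would wrap a real classifier and cross-validation loop.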
File: C50_CIBB_2024_LNBI_2025_233-247.pdf (Publisher’s version, Adobe PDF, 638.22 kB, restricted access)


