Condensation of Force Field Parameters from Machine Learning Predicted Distributions for High-Throughput Virtual Screening Applications
Zhang, Yuedong; Gadioli, Davide; Palermo, Gianluca
2025-01-01
Abstract
Transferable biomolecular force fields are developed by fitting either ab initio or experimental data for representative molecules, and can then be used to model chemical entities similar to the ones they were developed for. However, once parametrized on a given dataset, they are difficult to refit when new chemical entities, sensing schemes, or functional forms are introduced. Machine Learning Force Fields (MLFF), on the other hand, have recently gained attention for their accuracy and for the ease with which their Applicability Domain (AD) can be expanded. Nonetheless, their prediction times make them incompatible with High-Throughput Virtual Screening (HTVS) requirements. In this work, we invert the approach widely adopted for transferable force fields and propose a new condensation approach that uses machine learning algorithms to predict force field parameters at scale. Each generated numerical distribution is then condensed into a single value that statistically captures the chemical variability of the underlying molecules that share that force field parameter and give rise to the distribution. This improves computational efficiency by 30x, with a limited reduction in the accuracy of the predicted molecular geometries. When tested on the public release of the OpenFF Industry Benchmark Season 1 v1.1 dataset, the molecular structures optimized by minimizing the Potential Energy Surface built with condensed FF parameters show only a minor degradation in Root Mean Squared Deviation (RMSD) and Torsion Fingerprint Deviation (TFD) performance compared to those obtained using molecule-specific FF parameters predicted at runtime. For context, the original MLFF and its condensed version are also evaluated against several well-known transferable force fields widely used for biomolecular simulations.

| File | Size | Format |
|---|---|---|
| condensation-of-force-field-parameters-from-machine-learning-predicted-distributions-for-high-throughput-virtual.pdf (open access, publisher's version) | 3.26 MB | Adobe PDF |
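
The condensation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which statistic is used to condense each distribution, so the median below is an assumption, and the parameter keys and numeric values are hypothetical placeholders.

```python
from collections import defaultdict
from statistics import median


def condense_parameters(predictions):
    """Collapse per-molecule parameter predictions into one value per parameter.

    `predictions` is an iterable of (parameter_key, value) pairs, where each
    pair stands for one ML-predicted value of a force field parameter in one
    molecule. The median is an assumed choice of statistic.
    """
    distributions = defaultdict(list)
    for key, value in predictions:
        distributions[key].append(value)
    # One condensed value per parameter key replaces the whole distribution.
    return {key: median(values) for key, values in distributions.items()}


# Hypothetical predictions gathered across several molecules.
predictions = [
    ("bond_k:C-C", 310.0),
    ("bond_k:C-C", 305.0),
    ("bond_k:C-C", 320.0),
    ("bond_length:C-H", 1.09),
    ("bond_length:C-H", 1.10),
    ("bond_length:C-H", 1.08),
]

condensed = condense_parameters(predictions)
print(condensed)
```

Once built, such a table can be looked up directly at screening time, which is where the speed-up over predicting molecule-specific parameters at runtime would come from.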


