RE.PUBLIC@POLIMI Research Publications of Politecnico di Milano

Condensation of Force Field Parameters from Machine Learning Predicted Distributions for High-Throughput Virtual Screening Applications

Bonanni, Domenico; Zhang, Yuedong; Gadioli, Davide; Scarpellini, Gianluca; Morerio, Pietro; Del Bue, Alessio; Beccari, Andrea Rosario; Palermo, Gianluca

2025-01-01

Abstract

Transferable biomolecular force fields are developed by fitting either ab initio or experimental data for representative molecules and can then be used to model chemical entities similar to those they were developed for. However, once parametrized on a given dataset, they are difficult to refit when new chemical entities, sensing schemes, or functional forms are introduced. Machine Learning Force Fields (MLFF), on the other hand, have recently gained attention for their accuracy and for the ease of expanding their Applicability Domain (AD). Nonetheless, their prediction times make them incompatible with High-Throughput Virtual Screening (HTVS) requirements. In this work, we invert the approach widely adopted for transferable force fields and propose a new condensation approach that uses machine learning algorithms to predict force field parameters at scale. Each resulting numerical distribution is then condensed into a single value that statistically captures the chemical variability of the underlying molecules that share that force field parameter and give rise to the distribution. This improves computational efficiency 30-fold with only a limited reduction in the accuracy of the predicted molecular geometries. When tested on the public release of the OpenFF Industry Benchmark Season 1 v1.1 dataset, the molecular structures optimized by minimizing the Potential Energy Surface built with condensed FF parameters show only a minor degradation in Root Mean Squared Deviation (RMSD) and Torsion Fingerprint Deviation (TFD) performance compared to those obtained using molecule-specific FF parameters predicted at runtime. To give more context, the original MLFF and its condensed version are also evaluated against several well-known transferable force fields widely used for biomolecular simulations.
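The condensation step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistic is used to condense each distribution, so the median used here is an assumption, as are the function name `condense`, the parameter-type labels, and the sample values.

```python
from collections import defaultdict
from statistics import median


def condense(predictions):
    """Collapse per-molecule predictions into one value per parameter type.

    `predictions` is an iterable of (parameter_type, value) pairs, where each
    value is a force field parameter predicted for one molecule.
    The median is an assumed choice of statistic, not taken from the paper.
    """
    grouped = defaultdict(list)
    for param_type, value in predictions:
        grouped[param_type].append(value)
    # One representative value per parameter type.
    return {ptype: median(values) for ptype, values in grouped.items()}


# Hypothetical equilibrium bond lengths (angstroms) predicted across molecules.
predicted = [
    ("C-C", 1.52), ("C-C", 1.54), ("C-C", 1.53),
    ("C-O", 1.42), ("C-O", 1.44),
]
condensed = condense(predicted)
```

The resulting table is computed once and can then be looked up at screening time, so no model inference is needed per molecule; this is the source of the speed-up the abstract refers to.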

	Year of publication
	
				2025
			
	Journal title
	
				JOURNAL OF CHEMICAL INFORMATION AND MODELING
			
	Keywords
	
				Force Fields, MLFF, High Throughput Virtual Screening, Drug Discovery, HPC, Molecular Modeling
			
	Appears in types:
	
				01.1 Journal Article

Files in this item:

File	Size	Format
condensation-of-force-field-parameters-from-machine-learning-predicted-distributions-for-high-throughput-virtual.pdf (open access, Publisher's version)	3.26 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1303269
