The validation of quantitative structure-property/activity relationships (QSPR/QSAR) is an important challenge of modern theoretical chemistry. Analysis of QSPRs which are obtained with various distribution into sub-systems of training and of testing can be a useful approach to estimate reliability of QSPR predictions. The balance of correlation is an approach for the building up of QSPR with using three components of available data: (a) sub-training set (developer), (b) calibration set (critic), and (c) test set (estimator). Computational experiments have shown that the probabilistic interdependence between the distribution of available data into sub-training set, calibration set, and test set and the average numbers of outliers in the test set exists.
The average numbers of outliers over groups of various splits into training and test sets: a criterion of the reliability of a QSPR? A case of water solubility
GINI, GIUSEPPINA;
2012-01-01
Abstract
The validation of quantitative structure-property/activity relationships (QSPR/QSAR) is an important challenge of modern theoretical chemistry. Analysis of QSPRs which are obtained with various distribution into sub-systems of training and of testing can be a useful approach to estimate reliability of QSPR predictions. The balance of correlation is an approach for the building up of QSPR with using three components of available data: (a) sub-training set (developer), (b) calibration set (critic), and (c) test set (estimator). Computational experiments have shown that the probabilistic interdependence between the distribution of available data into sub-training set, calibration set, and test set and the average numbers of outliers in the test set exists.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.


