The average numbers of outliers over groups of various splits into training and test sets: a criterion of the reliability of a QSPR? A case of water solubility