Optimized data-driven method to study the self-healing and durability of ultra-high performance concrete

Zhewen Huang; Estefania Cuenca Asensio; Liberato Ferrara
2025-01-01

Abstract

Machine learning frameworks offer significant potential for predicting the self-healing properties of Ultra-High-Performance Concrete (UHPC) and can also support the design of advanced durability-based concrete. Existing machine learning models, while exhibiting good generalization, are limited in prediction accuracy because they are trained on small datasets that contain outliers. Moreover, many studies lack comprehensive interpretive analyses and therefore extract few physical insights related to self-healing. This paper introduces an advanced data-driven framework to predict crack closure and chloride diffusion in cracked concrete. The framework has three main components: (1) the Tree-structured Parzen Estimator and 10-fold cross-validation are employed to optimize the hyperparameters of the model; (2) Isolation Forest and forward stepwise selection are used for data cleaning, identifying and removing outliers and irrelevant variables in the dataset; (3) feature importance analysis, SHapley Additive exPlanations (SHAP), and Partial Dependence Plots (PDP) are used to explain and quantify the impact of features on model composition and prediction. The resulting framework, which uses Light Gradient Boosting Machine (LightGBM) as the base model, is characterized by high accuracy and strong generalization in predicting concrete self-healing and durability. Using the interpretive tools, the crack width thresholds for healing and chloride transport have been quantified at 60–125 μm and 35–60 μm, respectively.
Keywords: Light gradient boosting machine, Model interpretability, Machine learning, Anomaly detection, Self-healing, Ultra-high performance concrete
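
The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several simplifying assumptions: the data are synthetic, the feature names (`crack_width`, `healing_days`, `noise_feature`) are hypothetical, scikit-learn's `GradientBoostingRegressor` stands in for LightGBM, a single fixed hyperparameter set is scored instead of running a Tree-structured Parzen Estimator search, and forward stepwise selection and SHAP are omitted. Only outlier removal with Isolation Forest, 10-fold cross-validation, and a partial dependence curve are shown.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, IsolationForest
from sklearn.inspection import partial_dependence
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT from the paper); all feature names are hypothetical.
n = 300
crack_width = rng.uniform(10, 300, n)   # assumed unit: micrometres
healing_days = rng.uniform(1, 180, n)   # assumed unit: days
noise_feature = rng.normal(size=n)      # irrelevant by construction
X = np.column_stack([crack_width, healing_days, noise_feature])

# Made-up response: closure falls with crack width and rises with healing time.
y = 100 / (1 + np.exp((crack_width - 100) / 20)) * (1 - np.exp(-healing_days / 30))
y += rng.normal(scale=2.0, size=n)

# Data cleaning: drop rows that Isolation Forest labels as outliers (-1).
iso = IsolationForest(contamination=0.05, random_state=0)
inlier_mask = iso.fit_predict(np.column_stack([X, y])) == 1
X_clean, y_clean = X[inlier_mask], y[inlier_mask]

# Model evaluation: score one fixed hyperparameter set with 10-fold CV.
# (A TPE search would call this scoring step once per candidate setting.)
model = GradientBoostingRegressor(n_estimators=100, max_depth=3, random_state=0)
cv = KFold(n_splits=10, shuffle=True, random_state=0)
r2_scores = cross_val_score(model, X_clean, y_clean, cv=cv, scoring="r2")
mean_r2 = r2_scores.mean()

# Interpretation: partial dependence of the prediction on crack width (column 0).
model.fit(X_clean, y_clean)
pdp = partial_dependence(
    model, X_clean, features=[0], grid_resolution=20, kind="average"
)
pdp_curve = pdp["average"][0]

print(f"rows kept: {inlier_mask.sum()} of {n}")
print(f"mean 10-fold R^2: {mean_r2:.3f}")
```

In a sketch like this, a crack width threshold would be read off as the range over which `pdp_curve` changes most steeply. The thresholds reported in the abstract come from the authors' own data and models, not from this example.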
Files in this item:
File: Huang et al EAAI 2025 reduced.pdf (open access)
Description: Huang et al EAAI 2025, Publisher’s version
Size: 1.02 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1280966