Building characterization through smart meter data analytics: Determination of the most influential temporal and importance-in-prediction based features

Najafi B.; Depalo M.; Rinaldi F.
2021-01-01

Abstract

This paper aims to determine the most influential features that can be extracted from smart meter data to facilitate machine learning-based classification of non-residential buildings. Remote, smart meter-driven estimation of the chosen characteristics (the buildings’ performance class, use type, and operation group) is helpful in building commissioning, benchmarking, and diagnostics applications. As a first step, state-of-the-art feature selection methods and a proposed customized approach are used to determine the most influential parameters in a pool of temporal features proposed in a previous study. Next, importance-in-prediction based features, generated from an hour-ahead load prediction pipeline, are proposed and added as additional input parameters to improve the classification accuracy. Finally, interpretations of some of the most influential features for the different classification targets are provided. The results show that, for estimating the buildings’ use type, performing feature selection and adding importance-in-prediction based features reduces the number of features from 290 (the initial pool proposed in a previous study) to 29 while increasing the accuracy from 71% to 74%. Similarly, the number of features employed for estimating the performance class decreases from 224 to 17, and the accuracy improves from 56% to 62%. Finally, using only 6 selected features, compared with 287 in the initial set, the accuracy for classifying the operation group increases from 98% to 100%. By selecting and using notably fewer features, the proposed methodology thus simplifies the feature extraction procedure, improves the achieved accuracy, and makes it easier to interpret why some of the most important features are influential.

Keywords: Commercial building characterization; Feature extraction; Feature selection; Machine learning; Smart meter data analytics
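
The idea of importance-in-prediction based features mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the prediction model or the importance measure, so everything below is an assumption made for illustration: a linear least-squares model on standardized lagged loads stands in for the hour-ahead prediction pipeline, normalized absolute coefficients stand in for feature importances, and the function name, lag choices, and synthetic load profile are hypothetical.

```python
import numpy as np


def importance_in_prediction_features(load, lags=(1, 2, 24)):
    """Return normalized importances of lagged loads for hour-ahead prediction.

    Assumption: a linear least-squares fit replaces the paper's (unspecified)
    prediction model; |coefficient| on standardized inputs replaces its
    (unspecified) importance measure.
    """
    load = np.asarray(load, dtype=float)
    max_lag = max(lags)
    # Target: load at hour t. Inputs: load at hours t - lag.
    y = load[max_lag:]
    X = np.column_stack(
        [load[max_lag - lag:len(load) - lag] for lag in lags]
    )
    # Standardize inputs so coefficient magnitudes are comparable.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    A = np.column_stack([X, np.ones(len(y))])  # append intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    weights = np.abs(coef[:-1])  # drop the intercept
    return weights / weights.sum()


# Hypothetical building: four weeks of hourly load with a daily cycle.
rng = np.random.default_rng(0)
hours = np.arange(24 * 28)
load = 50 + 30 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 2, hours.size)

features = importance_in_prediction_features(load)
print(dict(zip(("lag_1h", "lag_2h", "lag_24h"), features.round(3))))
```

Under these assumptions, each building yields one short importance vector, which could be appended to its temporal features before feature selection and classification.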

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1207063
Citations
  • Scopus 18
  • ISI 14