Unsupervised learning for feature projection: Extracting patterns from multidimensional building measurements
Khayatian F.; Dall'O' G.
2020-01-01
Abstract
Data visualization is an important resource for decision makers who need to extract information from large datasets. Based on data obtained from either predictions or measurements, different strategies are combined and tested to reduce energy demand whilst keeping indoor comfort at a suitable level. Although the information conveyed by a data representation can significantly influence decisions, little research has focused on extracting features from building measurements. This paper provides an in-depth view of the representation of building data, and applies three dimensionality reduction algorithms, Principal Component Analysis (PCA), autoencoder and t-Distributed Stochastic Neighbour Embedding (t-SNE), to measurements from a teaching building. Results show that whilst PCA returns linear representations, it also compresses the data the least, which can be useful for obtaining more general features. On the other hand, t-SNE returns the most compressed data, which is suitable for seeking large margins within a dataset. However, t-SNE may be unsuitable for datasets with recurring step-like temporal profiles. The autoencoder is the best overall option, as it captures the nonlinearities within a dataset whilst avoiding excessive data compression. Fine-tuning the hyperparameters of the studied algorithms and the perils of relying on poorly tuned models are discussed at the end of the study.

File | Size | Format
---|---|---
Articolo E&B.pdf (open access, publisher’s version) | 8.18 MB | Adobe PDF
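To make the abstract's remark about PCA returning linear representations more concrete, the following is a minimal sketch of a linear projection computed with NumPy's SVD. It is not the authors' code or data: the synthetic "sensor" channels, the two latent drivers, and the helper name `pca_project` are all hypothetical, and reconstruction error is used here only as a rough stand-in for how much information a projection discards.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal directions via SVD.

    Hypothetical helper for illustration. Returns the low-dimensional
    scores, the linear reconstruction, and the fraction of variance kept.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    Z = Xc @ components.T                  # linear projection (scores)
    X_hat = Z @ components + mean          # linear reconstruction
    explained = (S[:n_components] ** 2).sum() / (S ** 2).sum()
    return Z, X_hat, explained


# Synthetic stand-in for two weeks of hourly building measurements.
# These channels are assumptions of this sketch, not the paper's dataset.
rng = np.random.default_rng(0)
hours = np.arange(24 * 14)
daily = np.sin(2 * np.pi * hours / 24)      # daily occupancy-like cycle
drift = np.linspace(-1.0, 1.0, hours.size)  # slow seasonal-like drift
X = np.column_stack([
    1.0 * daily + 0.3 * drift,   # e.g. indoor temperature
    0.8 * daily - 0.2 * drift,   # e.g. CO2 concentration
    0.5 * daily + 1.0 * drift,   # e.g. electrical power
    -0.6 * daily + 0.4 * drift,  # e.g. relative humidity
    1.0 * daily,                 # e.g. illuminance
]) + 0.1 * rng.normal(size=(hours.size, 5))

# Standardize each channel so that differing units do not dominate.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)

Z, X_hat, explained = pca_project(Xs, n_components=2)
print("projected shape:", Z.shape)
print("fraction of variance kept:", round(float(explained), 3))
print("mean squared reconstruction error:",
      round(float(((Xs - X_hat) ** 2).mean()), 4))
```

Because the projection is linear, the variance that is not kept equals the reconstruction error exactly. Nonlinear methods such as an autoencoder or t-SNE do not offer such a direct identity, which is one reason their hyperparameters need more careful tuning.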