Improved dynamic emulation modeling by time series clustering: The case study of Marina Reservoir, Singapore
GALELLI, STEFANO; CAIETTI MARIN, STEFANIA; CASTELLETTI, ANDREA FRANCESCO
2012-01-01
Abstract
Dynamic Emulation Modelling (DEMo) is emerging as a viable solution to combine computationally intensive simulation models and dynamic optimization algorithms. A dynamic emulator is a low-order surrogate of the simulation model, identified over a sample data set generated by the original simulation model itself. When applied to large 3D models, any DEMo exercise requires pre-processing of the exogenous drivers and state variables in order to reduce, by spatial aggregation, the high number of candidate variables that may appear in the final emulator. This work describes a hybrid clustering-variable selection approach to automatically discover compact and relevant representations of high-dimensional data sets. Time series clustering is adopted to identify spatial structures by objectively organizing data into homogeneous groups, where within-group similarity is maximized. In particular, the proposed approach relies on a hierarchical agglomerative clustering method, which starts by placing each time series in its own cluster and then merges clusters into progressively larger ones until a compact, yet informative, representation of the original variables is obtained. This representation is then processed with the Recursive Variable Selection - Iterative Input Selection algorithm in order to single out the most relevant clusters. The approach is demonstrated on a real-world case study concerning the reduction of Delft3D, a spatially distributed hydrodynamic model used to simulate salt intrusion dynamics in the tropical lake of Marina Reservoir, Singapore. Results show that the proposed approach yields a parsimonious yet accurate characterization of salinity concentration.
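
The agglomerative step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the correlation-based distance, the average-linkage rule, the fixed target number of clusters, the pointwise-mean aggregation and the toy data are all assumptions made here for illustration, and the subsequent Recursive Variable Selection - Iterative Input Selection step is not shown.

```python
import math
from itertools import combinations

def correlation_distance(a, b):
    """1 - Pearson correlation between two equal-length series (assumed metric)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 1.0  # constant series: treat as uncorrelated
    return 1.0 - cov / math.sqrt(var_a * var_b)

def agglomerative_clusters(series, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of series indices."""
    # Precompute pairwise distances between individual series.
    dist = {(i, j): correlation_distance(series[i], series[j])
            for i, j in combinations(range(len(series)), 2)}
    clusters = [[i] for i in range(len(series))]  # start: one cluster per series

    def linkage(c1, c2):
        # Average distance over all cross-cluster pairs.
        total = sum(dist[(min(i, j), max(i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    # Assumed stopping rule: merge until a fixed number of clusters remains.
    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]

def cluster_mean_series(series, cluster):
    """Aggregate a cluster into one representative series (pointwise mean)."""
    return [sum(series[i][t] for i in cluster) / len(cluster)
            for t in range(len(series[0]))]

# Hypothetical toy data: four "grid cells" showing two opposite behaviours.
toy = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.1, 2.1, 2.9, 4.2, 5.1],
    [5.0, 4.0, 3.0, 2.0, 1.0],
    [4.8, 4.1, 3.2, 1.9, 1.0],
]
groups = agglomerative_clusters(toy, n_clusters=2)
representatives = [cluster_mean_series(toy, g) for g in groups]
```

In this sketch each cluster is reduced to a single aggregated series, so the candidate inputs passed to a later variable selection step would number `n_clusters` rather than one per grid cell.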