Indirect estimation of service demands in the presence of structural changes
Cremonesi, Paolo; Sansottera, Andrea
2012-01-01
Abstract
According to the utilization law, throughput and utilization are linearly related, and their measurements can be used for the indirect estimation of service demands. In practice, however, hardware and software modifications, as well as non-modeled loads due to periodic maintenance activities, make the estimation process difficult and often impossible without manual analysis of the data. Because of configuration changes, workload and utilization measurements in real-world data sets tend to group into multiple linear clusters. To estimate the service demands of the underlying performance models, the different configurations have to be identified. In this paper, we present an algorithm that exploits the timestamps associated with each throughput and utilization observation to identify the different configurations of the system and estimate the corresponding service demands. Our proposal is based on robust estimation and inference techniques and is therefore suitable for analyzing contaminated data sets. Moreover, it detects not only sudden and occasional changes of the system, but also recurring patterns in the system's behavior, due for instance to scheduled maintenance tasks. An efficient implementation of the algorithm has been made publicly available and, in this paper, its performance is assessed on synthetic as well as on experimental data.
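As a rough illustration of the idea in the abstract, the sketch below fits the utilization law, utilization = demand × throughput + background load, separately for each configuration. It is not the authors' algorithm: it assumes the change time is already known, whereas the paper detects it from the timestamps, and it swaps in the Theil-Sen estimator (median of pairwise slopes) as a stand-in robust regression. The function names `theil_sen` and `estimate_per_configuration` and the sample values are hypothetical.

```python
from itertools import combinations
from statistics import median


def theil_sen(throughput, utilization):
    """Robustly fit utilization ~ demand * throughput + background.

    Uses the Theil-Sen estimator: the slope is the median of all
    pairwise slopes, so a minority of outliers cannot dominate it.
    """
    slopes = [
        (utilization[j] - utilization[i]) / (throughput[j] - throughput[i])
        for i, j in combinations(range(len(throughput)), 2)
        if throughput[j] != throughput[i]
    ]
    demand = median(slopes)
    # Intercept: median residual, read here as non-modeled background load.
    background = median(
        u - demand * x for x, u in zip(throughput, utilization)
    )
    return demand, background


def estimate_per_configuration(samples, change_time):
    """Estimate one service demand per configuration.

    samples: list of (timestamp, throughput, utilization) tuples.
    change_time: ASSUMED known here; the paper infers it from the data.
    """
    estimates = []
    for keep in (lambda t: t < change_time, lambda t: t >= change_time):
        xs = [x for t, x, _ in samples if keep(t)]
        us = [u for t, _, u in samples if keep(t)]
        estimates.append(theil_sen(xs, us))
    return estimates


# Hypothetical data: demand 0.01 s before t=5, 0.02 s afterwards,
# with one contaminated utilization sample at t=2.
samples = [
    (0, 10, 0.1), (1, 20, 0.2), (2, 30, 0.9), (3, 40, 0.4), (4, 50, 0.5),
    (5, 5, 0.1), (6, 10, 0.2), (7, 15, 0.3), (8, 20, 0.4), (9, 25, 0.5),
]
for demand, background in estimate_per_configuration(samples, change_time=5):
    print(f"demand={demand:.4f} s, background={background:.4f}")
```

An ordinary least-squares fit over all ten samples would blend the two configurations and be pulled by the contaminated point. Splitting by time and using a median-based slope is one simple way to avoid both effects in this toy setting.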


