The last years witnessed a steep rise in data generation worldwide and, consequently, the widespread adoption of software solutions able to support data-intensive application. Competitiveness and innovation have strongly benefited from these new platforms and methodologies, and there is a great deal of interest around the new possibilities that Big Data analytics promise to make reality. Many companies currently engage in data-intensive processes as part of their core businesses; however, fully embracing the data-driven paradigm is still cumbersome, and establishing a production-ready, fine-tuned deployment is time-consuming, expensive, and resource-intensive. This situation calls for innovative models and techniques to streamline the process of deployment configuration for Big Data applications. In particular, the focus in this paper is on the rightsizing of Cloud deployed clusters, which represent a cost-effective alternative to installation on premises. This paper proposes a novel tool, integrated in a wider DevOps-inspired approach, implementing a parallel and distributed simulation-optimization technique that efficiently and effectively explores the space of alternative Cloud configurations, seeking the minimum cost deployment that satisfies quality of service constraints. The soundness of the proposed solution has been thoroughly validated in a vast experimental campaign encompassing different applications and Big Data platforms.

Optimizing Quality-Aware Big Data Applications in the Cloud

E. Gianniti;M. Ciavotta;D. Ardagna
2021-01-01

Abstract

The last years witnessed a steep rise in data generation worldwide and, consequently, the widespread adoption of software solutions able to support data-intensive application. Competitiveness and innovation have strongly benefited from these new platforms and methodologies, and there is a great deal of interest around the new possibilities that Big Data analytics promise to make reality. Many companies currently engage in data-intensive processes as part of their core businesses; however, fully embracing the data-driven paradigm is still cumbersome, and establishing a production-ready, fine-tuned deployment is time-consuming, expensive, and resource-intensive. This situation calls for innovative models and techniques to streamline the process of deployment configuration for Big Data applications. In particular, the focus in this paper is on the rightsizing of Cloud deployed clusters, which represent a cost-effective alternative to installation on premises. This paper proposes a novel tool, integrated in a wider DevOps-inspired approach, implementing a parallel and distributed simulation-optimization technique that efficiently and effectively explores the space of alternative Cloud configurations, seeking the minimum cost deployment that satisfies quality of service constraints. The soundness of the proposed solution has been thoroughly validated in a vast experimental campaign encompassing different applications and Big Data platforms.
2021
Cloud computing , Big Data , Unified modeling language
File in questo prodotto:
File Dimensione Formato  
main.pdf

accesso aperto

: Post-Print (DRAFT o Author’s Accepted Manuscript-AAM)
Dimensione 7.15 MB
Formato Adobe PDF
7.15 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11311/1158136
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 6
  • ???jsp.display-item.citation.isi??? 2
social impact