The last years witnessed a steep rise in data generation worldwide and, consequently, the widespread adoption of software solutions claiming to support data intensive applications. Competitiveness and innovation have strongly benefited from these new platforms and methodologies, and there is a great deal of interest around the new possibilities that Big Data analytics promise to make reality. Many companies currently en- gage in data intensive processes as part of their core businesses; however, fully embracing the data-driven paradigm is still cumbersome, and es- tablishing a production-ready, fine-tuned deployment is time-consuming, expensive, and resource-intensive. This situation calls for novel models and techniques to streamline the process of deployment configuration for Big Data applications. In particular, the focus in this paper is on the rightsizing of Cloud deployed clusters, which represent a cost-effective alternative to installation on premises. We propose a novel tool, inte- grated in a wider DevOps-inspired approach, implementing a parallel and distributed simulation-optimization technique that efficiently and effec- tively explores the space of alternative resource configurations, seeking the minimum cost deployment that satisfies predefined quality of service constraints. The validity and relevance of the proposed solution has been thoroughly validated in a vast experimental campaign including different applications and Big Data platforms.

D-SPACE4Cloud: Towards Quality-Aware Data Intensive Applications in the Cloud

Eugenio Gianniti;Danilo Ardagna
In corso di stampa

Abstract

The last years witnessed a steep rise in data generation worldwide and, consequently, the widespread adoption of software solutions claiming to support data intensive applications. Competitiveness and innovation have strongly benefited from these new platforms and methodologies, and there is a great deal of interest around the new possibilities that Big Data analytics promise to make reality. Many companies currently en- gage in data intensive processes as part of their core businesses; however, fully embracing the data-driven paradigm is still cumbersome, and es- tablishing a production-ready, fine-tuned deployment is time-consuming, expensive, and resource-intensive. This situation calls for novel models and techniques to streamline the process of deployment configuration for Big Data applications. In particular, the focus in this paper is on the rightsizing of Cloud deployed clusters, which represent a cost-effective alternative to installation on premises. We propose a novel tool, inte- grated in a wider DevOps-inspired approach, implementing a parallel and distributed simulation-optimization technique that efficiently and effec- tively explores the space of alternative resource configurations, seeking the minimum cost deployment that satisfies predefined quality of service constraints. The validity and relevance of the proposed solution has been thoroughly validated in a vast experimental campaign including different applications and Big Data platforms.
Nonlinear programming, Performance of Systems, Distributed Systems
File in questo prodotto:
File Dimensione Formato  
TCCGianniti.pdf

accesso aperto

Descrizione: Pre-Refereeing version
: Pre-Print (o Pre-Refereeing)
Dimensione 7.19 MB
Formato Adobe PDF
7.19 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/11311/1066156
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact