RE.PUBLIC@POLIMI Research Publications of Politecnico di Milano

D-SPACE4Cloud: Towards Quality-Aware Data Intensive Applications in the Cloud

Eugenio Gianniti; Michele Ciavotta; Danilo Ardagna

In press

Abstract

Recent years have witnessed a steep rise in data generation worldwide and, consequently, the widespread adoption of software solutions claiming to support data intensive applications. Competitiveness and innovation have strongly benefited from these new platforms and methodologies, and there is a great deal of interest in the new possibilities that Big Data analytics promises to make a reality. Many companies currently engage in data intensive processes as part of their core businesses; however, fully embracing the data-driven paradigm is still cumbersome, and establishing a production-ready, fine-tuned deployment is time-consuming, expensive, and resource-intensive. This situation calls for novel models and techniques to streamline the process of deployment configuration for Big Data applications. In particular, the focus of this paper is on the rightsizing of Cloud deployed clusters, which represent a cost-effective alternative to installation on premises. We propose a novel tool, integrated in a wider DevOps-inspired approach, implementing a parallel and distributed simulation-optimization technique that efficiently and effectively explores the space of alternative resource configurations, seeking the minimum cost deployment that satisfies predefined quality of service constraints. The validity and relevance of the proposed solution have been thoroughly assessed in a vast experimental campaign including different applications and Big Data platforms.

Year of publication	In press
Journal title	IEEE TRANSACTIONS ON CLOUD COMPUTING
Keywords	Nonlinear programming, Performance of Systems, Distributed Systems
Appears in types	01.1 Journal Article

Files in this item:

File	Access	Description	Size	Format
TCCGianniti.pdf	Open access	Pre-Refereeing version : Post-Print (DRAFT or Author’s Accepted Manuscript-AAM)	7.19 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1066156
