Big-data applications as self-adaptive systems of systems

Baresi L.; Denaro G.; Quattrocchi G.
2019-01-01

Abstract

Virtualization technologies have enabled a new way of thinking about computing resources, and cloud computing frameworks offer many pay-per-use solutions for renting these resources. Conventional physical servers had to be acquired, provisioned, and configured beforehand; virtual resources can be allocated on demand, and changes can be managed quickly. Deploying systems on virtualized resources allows one to allocate resources according to the actual workload and the KPIs of interest, but it requires that resource management be part of the system itself. Traditional application components must be augmented with probes and actuators to sense the application behavior and provision resources accordingly. Big data applications are a prominent example of these modern systems, and the paper discusses dynaSpark, that is, the work done by the authors to extend Spark standalone, a well-known framework widely used for parallel processing and big data applications, and augment it with resource management capabilities. It also introduces the key problems that the integration and the particular batch applications bring in, and identifies additional aspects that are still to be taken into account and that would lead to a better solution.
2019
Proceedings - 2019 IEEE 30th International Symposium on Software Reliability Engineering Workshops, ISSREW 2019
978-1-7281-5138-0
Big-data applications
Dynamic resource management
Spark

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1167258