Cloud computing refers to maximizing efficiency by sharing computational and storage resources, while data-parallel systems exploit the resources available in the cloud to perform parallel transformations over large amounts of data. In the same line, considerable emphasis has been recently given to two apparently disjoint research topics: data-parallel, and eventually consistent, distributed systems. Declarative networking has been recently proposed to ease the task of programming in the cloud, by allowing the programmer to express only the desired result and leave the implementation details to the responsibility of the run-time system. In this context, we deem it appropriate to propose a study on a logic-programming-based computational model for eventually consistent, data-parallel systems, the keystone of which is provided by the recent finding that the class of programs that can be computed in an eventually consistent, coordination-free way is that of monotonic programs. This principle is called Consistency and Logical Monotonicity (CALM) and has been proven by Ameloot et al. for distributed, asynchronous settings. We advocate that CALM should be employed as a basic theoretical tool also for data-parallel systems, wherein computation usually proceeds synchronously in rounds and where communication is assumed to be reliable. We deem this problem relevant and interesting, especially for what concerns parallel dataflow optimizations. Nowadays, we are in fact witnessing an increasing concern about understanding which properties distinguish synchronous from asynchronous parallel processing, and when the latter can replace the former. It is general opinion that coordination-freedom can be seen as a major discriminant factor. In this work, we make the case that the current form of CALM does not hold in general for data-parallel systems, and show how, using novel techniques, the satisfiability of the CALM principle can still be obtained although just for the subclass of programs called connected monotonic queries. We complete the study with considerations on the relationships between our model and the one employed by Ameloot et al., showing that our techniques subsume the latter when the synchronization constraints imposed on the system are loosened.

A datalog-based computational model for coordination-free, data-parallel systems

TANCA, LETIZIA
2018-01-01

Abstract

Cloud computing refers to maximizing efficiency by sharing computational and storage resources, while data-parallel systems exploit the resources available in the cloud to perform parallel transformations over large amounts of data. In the same line, considerable emphasis has been recently given to two apparently disjoint research topics: data-parallel, and eventually consistent, distributed systems. Declarative networking has been recently proposed to ease the task of programming in the cloud, by allowing the programmer to express only the desired result and leave the implementation details to the responsibility of the run-time system. In this context, we deem it appropriate to propose a study on a logic-programming-based computational model for eventually consistent, data-parallel systems, the keystone of which is provided by the recent finding that the class of programs that can be computed in an eventually consistent, coordination-free way is that of monotonic programs. This principle is called Consistency and Logical Monotonicity (CALM) and has been proven by Ameloot et al. for distributed, asynchronous settings. We advocate that CALM should be employed as a basic theoretical tool also for data-parallel systems, wherein computation usually proceeds synchronously in rounds and where communication is assumed to be reliable. We deem this problem relevant and interesting, especially for what concerns parallel dataflow optimizations. Nowadays, we are in fact witnessing an increasing concern about understanding which properties distinguish synchronous from asynchronous parallel processing, and when the latter can replace the former. It is general opinion that coordination-freedom can be seen as a major discriminant factor. In this work, we make the case that the current form of CALM does not hold in general for data-parallel systems, and show how, using novel techniques, the satisfiability of the CALM principle can still be obtained although just for the subclass of programs called connected monotonic queries. We complete the study with considerations on the relationships between our model and the one employed by Ameloot et al., showing that our techniques subsume the latter when the synchronization constraints imposed on the system are loosened.
declarative networking, datalog, relational transducer, CALM conjecture, bulk synchronous parallel systems
File in questo prodotto:
File Dimensione Formato  
1405.7264.pdf

Accesso riservato

Descrizione: A datalog-based computational model for coordination-free, data-parallel systems
: Pre-Print (o Pre-Refereeing)
Dimensione 1.64 MB
Formato Adobe PDF
1.64 MB Adobe PDF   Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11311/1063061
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? 1
social impact