RE.PUBLIC@POLIMI Research Publications of Politecnico di Milano

Retrieval of the most relevant facts from data streams joined with slowly evolving dataset published on the web of data

Zahmatkesh, Shima

2017-01-01

Abstract

Finding the most relevant facts among dynamic and heterogeneous data published on the Web of Data has received growing attention in recent years. RDF Stream Processing (RSP) engines offer a baseline solution for integrating and processing streaming data together with data distributed on the Web. Unfortunately, the time needed to access and fetch the distributed data can be so high as to put the RSP engine at risk of losing reactiveness, especially when the distributed data is slowly evolving. State-of-the-art work addressed this problem by proposing an architectural solution that keeps a local replica of the distributed data, together with a baseline maintenance policy to refresh it over time. This doctoral thesis investigates advanced policies that let RSP engines continuously answer top-k queries, which require joining data streams with slowly evolving datasets published on the Web of Data, without violating the reactiveness constraints imposed by the users. In particular, it proposes policies that focus on refreshing only the data in the replica that contributes to the correctness of the top-k results.

Year of publication: 2017
Book title: CEUR Workshop Proceedings
Keywords: Continuous SPARQL query processing; Distributed linked data; RDF stream; Top-K query processing; Computer Science (all)
Appears in types: 04.1 Conference proceedings contribution

Files in this item:

File	Size	Format
paper_14.pdf (open access, Publisher’s version)	222.05 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1038746
