RE.PUBLIC@POLIMI Research Publications of Politecnico di Milano

High-throughput subset matching on commodity GPU-based systems

Rogora, Daniele; Papalini, Michele; Khazaei, Koorosh; Margara, Alessandro; Carzaniga, Antonio; Cugola, Gianpaolo

2017-01-01

Abstract

Large-scale information processing often relies on subset matching for data classification and routing. Examples are publish/subscribe and stream processing systems, database systems, social media, and information-centric networking. For instance, an advanced Twitter-like messaging service, where users might follow specific publishers as well as specific topics encoded as tag sets, must join a stream of published messages with the users and their preferred tag sets so that the user tag set is a subset of the message tags. Subset matching is an old but also notoriously difficult problem. We present TagMatch, a system that solves this problem by taking advantage of a hybrid CPU/GPU stream processing architecture. TagMatch targets large-scale applications with thousands of matching operations per second against hundreds of millions of tag sets. We evaluate TagMatch on an advanced message streaming application, with very positive results both in absolute terms and in comparison with existing systems. As a notable example, our experiments demonstrate that TagMatch running on a single commodity machine with two GPUs can easily sustain the traffic throughput of Twitter, even augmented with expressive tag-based selection.
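
The matching condition in the abstract (a user's tag set must be a subset of the message's tags) can be illustrated with a minimal sketch. This is a naive linear scan in Python, meant only to make the condition concrete. It is not TagMatch's algorithm and does not reflect its hybrid CPU/GPU architecture. The names `Subscriptions` and `match`, and the sample users and tags, are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

# Hypothetical subscription table: user name -> tag sets the user follows.
Subscriptions = Dict[str, List[FrozenSet[str]]]


def match(message_tags: Iterable[str], subscriptions: Subscriptions) -> Set[str]:
    """Return users with at least one followed tag set contained in the message tags."""
    tags = frozenset(message_tags)
    return {
        user
        for user, tag_sets in subscriptions.items()
        # `<=` on frozensets is the subset test.
        if any(tag_set <= tags for tag_set in tag_sets)
    }


subscriptions: Subscriptions = {
    "alice": [frozenset({"gpu", "systems"})],
    "bob": [frozenset({"databases"}), frozenset({"gpu"})],
    "carol": [frozenset({"gpu", "networking"})],
}

matched = match({"gpu", "systems", "eurosys"}, subscriptions)
print(sorted(matched))  # ['alice', 'bob']
```

Each message is checked against every stored tag set, so the cost grows linearly with the number of subscriptions. At the scale the abstract mentions (hundreds of millions of tag sets), such a scan would presumably be too slow, which is the gap a dedicated system is meant to close.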

Year of publication: 2017
Book title: Proceedings of the 12th European Conference on Computer Systems, EuroSys 2017
ISBN (International Standard Book Number): 9781450349383
Keywords: GPU-based processing; Message selection and dissemination; Subset matching; Computer Networks and Communications; Software; Information Systems; Hardware and Architecture; Control and Systems Engineering
Appears in types: 04.1 Conference proceedings contribution

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1028888
