This paper investigates diversity queries over objects embedded in a low-dimensional vector space. An interesting case is provided by spatial Web objects, which are produced in great quantity by location-based services that let users attach content to places, and arise also in trip planning, news analysis, and real estate scenarios. The targeted queries aim at retrieving the best set of objects relevant to given user criteria and well distributed over a region of interest. Such queries are a particular case of diversified top-k queries, for which existing methods are too costly, as they evaluate diversity by accessing and scanning all relevant objects, even if only a small subset is needed. We therefore introduce Space Partitioning and Probing (SPP), an algorithm that minimizes the number of accessed objects while finding exactly the same result as MMR, the most popular diversification algorithm. SPP belongs to a family of algorithms that rely only on score-based and distance-based access methods, which are available in most geo-referenced Web data sources, and do not require retrieving all the relevant objects. Experiments show that SPP significantly reduces the number of accessed objects while incurring a very low computational overhead.

Top-k bounded diversification

FRATERNALI, PIERO;MARTINENGHI, DAVIDE;TAGLIASACCHI, MARCO
2012

Abstract

This paper investigates diversity queries over objects embedded in a low-dimensional vector space. An interesting case is provided by spatial Web objects, which are produced in great quantity by location-based services that let users attach content to places, and arise also in trip planning, news analysis, and real estate scenarios. The targeted queries aim at retrieving the best set of objects relevant to given user criteria and well distributed over a region of interest. Such queries are a particular case of diversified top-k queries, for which existing methods are too costly, as they evaluate diversity by accessing and scanning all relevant objects, even if only a small subset is needed. We therefore introduce Space Partitioning and Probing (SPP), an algorithm that minimizes the number of accessed objects while finding exactly the same result as MMR, the most popular diversification algorithm. SPP belongs to a family of algorithms that rely only on score-based and distance-based access methods, which are available in most geo-referenced Web data sources, and do not require retrieving all the relevant objects. Experiments show that SPP significantly reduces the number of accessed objects while incurring a very low computational overhead.
Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data
9781450312479
File in questo prodotto:
File Dimensione Formato  
SIGMOD2012_Tagliasacchi.pdf

Accesso riservato

: Post-Print (DRAFT o Author’s Accepted Manuscript-AAM)
Dimensione 817.71 kB
Formato Adobe PDF
817.71 kB Adobe PDF   Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/11311/661630
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 43
  • ???jsp.display-item.citation.isi??? ND
social impact