Disambiguated query suggestions and personalized content-similarity and novelty ranking of clustered results to optimize web searches

Bordogna, G.; Campi, Alessandro; Psaila, Giuseppe; Ronchi, Stefania

In this paper, we address the so-called “ranked list problem” of Web searches, which occurs when users submit short requests to search engines. Generally, as a consequence of terms’ ambiguity and polysemy, users engage in long cycles of query reformulation in an attempt to capture relevant information in the top-ranked results. The overall objective of the proposal is to support the user in optimizing Web searches by reducing the need for long search iterations. Specifically, we describe an iterative query disambiguation mechanism that consists of three main phases. (1) The results of a Web search performed by the user (by submitting a query to a search engine) are clustered. (2) The clusters are ranked based on a personalized balance of their content-similarity to the query and their novelty. (3) From each cluster, a disambiguated query that highlights the main contents of the cluster is generated, so that the new query can potentially retrieve documents not previously retrieved; the disambiguated queries serve as suggestions for new and more focused searches. The paper describes the proposal and illustrates a sample application of the mechanism. Finally, it presents a user evaluation experiment that compares the proposed approach with the common practice of using search engines directly.
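As a rough illustration of phase (2), the sketch below ranks clusters by a weighted combination of content-similarity and novelty. It is not the authors' formulation: the abstract does not give the scoring functions, so the Jaccard overlap used for similarity, the fraction of unseen documents used for novelty, the linear weighting, and all names (`Cluster`, `rank_clusters`, `alpha`) are assumptions made here for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Cluster:
    label: str
    terms: set[str]    # representative terms of the cluster (hypothetical field)
    doc_ids: set[str]  # identifiers of the documents grouped in the cluster


def content_similarity(query_terms: set[str], cluster: Cluster) -> float:
    """Jaccard overlap between query and cluster terms (assumed measure)."""
    union = query_terms | cluster.terms
    return len(query_terms & cluster.terms) / len(union) if union else 0.0


def novelty(cluster: Cluster, seen: set[str]) -> float:
    """Fraction of the cluster's documents not retrieved before (assumed measure)."""
    if not cluster.doc_ids:
        return 0.0
    return len(cluster.doc_ids - seen) / len(cluster.doc_ids)


def rank_clusters(
    query_terms: set[str],
    clusters: list[Cluster],
    seen: set[str],
    alpha: float = 0.5,
) -> list[Cluster]:
    """Sort clusters by alpha * similarity + (1 - alpha) * novelty, best first.

    alpha stands in for the personalized balance: 1.0 favors only
    content-similarity, 0.0 favors only novelty.
    """
    def score(c: Cluster) -> float:
        return alpha * content_similarity(query_terms, c) + (1 - alpha) * novelty(c, seen)

    return sorted(clusters, key=score, reverse=True)
```

With a query such as "jaguar", a user who sets `alpha` close to 1 would see the cluster whose terms best match the query first, whereas a user who sets it close to 0 would see first the cluster containing the most documents not yet retrieved in earlier iterations.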