Topic tomographies (Toptom): A visual approach to distill information from media streams

Gobbo, Beatrice; Mauri, M.; Ciuccarelli, P.
2019

Abstract

In this paper we present Top Tom, a digital platform that provides analytical and visual solutions for exploring a dynamic corpus of user‐generated messages and media articles. It aims at i) distilling the information from thousands of documents into a low‐dimensional space of explainable topics, ii) clustering them hierarchically while allowing users to drill down to the details and stories that constitute the topics, and iii) spotting trends and anomalies. Top Tom implements a batch processing pipeline that can run both in near‐real time on time‐stamped data from streaming sources and in cold‐start mode on historical data with a temporal dimension. The resulting output unfolds along three main axes: time, volume, and semantic similarity (i.e. hierarchical topic aggregation). To support multiscale browsing of the data and the identification of anomalous behaviors, the visualizations draw on three visual metaphors from the biological and medical fields: the flow of particles in a coherent stream, tomographic cross‐sectioning, and contrast‐like analysis of biological tissues. The platform interface is composed of three main visualizations with coherent and smooth navigation interactions: calendar view, flow view, and temporal cut view. Integrating these three visual models with the multiscale analytic pipeline yields a novel system for identifying and exploring topics in unstructured texts. We evaluated the system on a collection of documents about the emerging opioid epidemic in the United States.
Human-centered computing → Visualization
Information systems → Document topic models; Expert search
Files in this item:
[Gobbo et al. 2019] Topic Tomographies (TopTom)- a visual approach to distill information from media streams.pdf: Publisher’s version, restricted access, 5.66 MB, Adobe PDF
11311-1100266_Mauri.pdf: Post-print (draft or Author’s Accepted Manuscript, AAM), open access, 6.38 MB, Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1100266