TETYS: Configurable Topic Modeling Exploration for Big Corpora of Text Documents
Francesco Invernici; Anna Bernasconi; Francesca Curati; Jelena Jakimov; Amirhossein Samavi
2025-01-01
Abstract
Exploring vast text corpora is typically very time-consuming. Topic modeling makes it possible to discover key concepts in massive text datasets without prior knowledge of their content. We built TETYS, an end-to-end, easily configurable topic modeling pipeline for processing and visualizing datasets. We demonstrate it on five datasets covering research on the Sustainable Development Goals, which define the world’s most pressing social, economic, and environmental challenges. TETYS is based on neural topic modeling and exploits LLMs to perform well across many domains, including research publications ranging from the human sciences to engineering and technology. In this demo, participants can interact with the dashboard to discover insights about the datasets and to inspect and test temporal trends in their research topics. Tool: http://gmql.eu/tetys. Video: https://tinyurl.com/tetys-video. Code: https://github.com/FrInve/TETYS.
File: paper-323.pdf (open access, publisher’s version), 1.4 MB, Adobe PDF


