Data and Process Quality Evaluation in a Textual Big Data Archiving System
M. Fugini; J. Finocchi
2021-01-01
Abstract
The article presents a textual Big Data analytics solution developed in a real-world setting as part of a high-capacity document digitization and storage system. Software based on machine learning techniques automatically extracts and processes textual content. The work focuses on evaluating performance and data confidence, describes an approach to computing a set of quality indicators for textual data, and presents experimental results.

Files in this item:
| File | Description | Version | Access | Size | Format |
|---|---|---|---|---|---|
| JOCCH_R2.pdf | Main article | Publisher’s version | Restricted access | 1.2 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


