Mining tree-based frequent patterns from XML

Mazuran, Mirjana; Quintarelli, Elisa; Tanca, Letizia
2009

Abstract

The growing number of very large XML datasets available to casual users poses a challenge for our community and calls for appropriate support for gathering knowledge from these data efficiently. Data mining, already widely applied to extract frequent correlations of values from both structured and semi-structured datasets, is the appropriate field for knowledge elicitation. In this work we describe an approach for extracting tree-based association rules from XML documents. Such rules provide approximate, intensional information on both the structure and the content of XML documents, and can be stored in XML format to be queried later. A prototype system demonstrates the effectiveness of the approach.
Flexible Query Answering Systems, 8th International Conference, FQAS 2009

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/11311/553658
Citations
  • PMC: ND
  • Scopus: 4
  • ISI: 7