This paper presents a novel semi-automatic approach to construct conceptual ontologies over structured data by exploiting both the schema and content of the input dataset. It effectively combines two well-founded database and data mining techniques, i.e., functional dependency discovery and association rule mining, to support domain experts in the construction of meaningful ontologies, tailored to the analyzed data, by using Description Logic (DL). To this aim, functional dependencies are first discovered to highlight valuable conceptual relationships among attributes of the data schema (i.e., among concepts). The set of discovered correlations effectively support analysts in the assertion of the Tbox ontological statements (i.e., the statements involving shared data conceptualizations and their relationships). Then, the analyst-validated dependencies are exploited to drive the association rule mining process. Association rules represent relevant and hidden correlations among data content and they are used to provide valuable knowledge at the instance level. The pushing of functional dependency constraints into the rule mining process allows analysts to look into and exploit only the most significant data item recurrences in the assertion of the Abox ontological statements (i.e., the statements involving concept instances and their relationships).
Semi-Automatic Ontology Construction by Exploiting Functional Dependencies and Association Rules
GARZA, PAOLO
2011-01-01
Abstract
This paper presents a novel semi-automatic approach to construct conceptual ontologies over structured data by exploiting both the schema and content of the input dataset. It effectively combines two well-founded database and data mining techniques, i.e., functional dependency discovery and association rule mining, to support domain experts in the construction of meaningful ontologies, tailored to the analyzed data, by using Description Logic (DL). To this aim, functional dependencies are first discovered to highlight valuable conceptual relationships among attributes of the data schema (i.e., among concepts). The set of discovered correlations effectively support analysts in the assertion of the Tbox ontological statements (i.e., the statements involving shared data conceptualizations and their relationships). Then, the analyst-validated dependencies are exploited to drive the association rule mining process. Association rules represent relevant and hidden correlations among data content and they are used to provide valuable knowledge at the instance level. The pushing of functional dependency constraints into the rule mining process allows analysts to look into and exploit only the most significant data item recurrences in the assertion of the Abox ontological statements (i.e., the statements involving concept instances and their relationships).File | Dimensione | Formato | |
---|---|---|---|
GarzaIJSWIS11.pdf
Accesso riservato
:
Post-Print (DRAFT o Author’s Accepted Manuscript-AAM)
Dimensione
803.61 kB
Formato
Adobe PDF
|
803.61 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.