Publication date:
2014
Citation:
Discovering the topics of a data source: A statistical approach? / Bergamaschi, Sonia; Ferrari, Davide; Guerra, Francesco; Simonini, Giovanni. - 1310:(2014). (Workshop on Surfacing the Deep and the Social Web, SDSW 2014, Co-located with the 13th International Semantic Web Conference, ISWC 2014 ita 2014).
Abstract:
In this paper, we present a preliminary approach for automatically discovering the topics of a structured data source with respect to a reference ontology. Our technique relies on a signature, i.e., a weighted graph that summarizes the content of a source. Graph-based approaches have already been used in the literature for similar purposes. In these proposals, the weights are typically assigned using traditional information-theoretic quantities such as entropy and mutual information. Here, we propose a novel data-driven technique based on composite likelihood to estimate the weights and other main features of the graphs, making the resulting approach less sensitive to overfitting. By comparing signatures, we can easily discover the topic of a target data source with respect to a reference ontology. This task is performed by a matching algorithm that retrieves the elements common to both graphs. To illustrate our approach, we discuss a preliminary evaluation in the form of a running example.
CRIS type:
Conference proceedings paper
Keywords:
topic discovery; entropy
Authors:
Bergamaschi, Sonia; Ferrari, Davide; Guerra, Francesco; Simonini, Giovanni
Book title:
CEUR Workshop Proceedings
Published in: