
Entity Resolution On-Demand for Querying Dirty Datasets

Conference proceedings contribution
Publication date:
2023
Citation:
Entity Resolution On-Demand for Querying Dirty Datasets / Simonini, Giovanni; Zecchini, Luca; Naumann, Felix; Bergamaschi, Sonia. - 3478:(2023), pp. 410-419. (31st Italian Symposium on Advanced Database Systems (SEBD 2023), Galzignano Terme (Padova), Italy, July 2-5, 2023).
Abstract:
Entity Resolution (ER) is the process of identifying and merging records that refer to the same real-world entity. ER is usually applied as an expensive cleaning step on the entire data before consuming it, yet the relevance of cleaned entities ultimately depends on the user’s specific application, which may only require a small portion of the entities. We introduce BrewER, a framework designed to evaluate SQL SP queries on unclean data while progressively providing results as if they were obtained from cleaned data. BrewER aims at cleaning a single entity at a time, adhering to an ORDER BY predicate, thus it inherently supports top-k queries and stop-and-resume execution. This approach can save a significant amount of resources for various applications. BrewER has been implemented as an open-source Python library and can be seamlessly employed with existing ER tools and algorithms. We thoroughly demonstrated its efficiency through its evaluation on four real-world datasets.
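The on-demand idea in the abstract can be illustrated with a minimal sketch. This is not BrewER's actual API or algorithm; the function names, the pre-computed candidate blocks, the greedy matcher, and the MAX conflict-resolution rule are all assumptions made for illustration. The sketch emits entities in ORDER BY ... DESC order and resolves a block only when it could contain the next result.

```python
import heapq
from itertools import count, islice


def resolve_block(block, match):
    """Greedily group the records of one candidate block into entities.

    Hypothetical helper: a real ER tool would supply the matching step.
    """
    clusters = []
    for rec in block:
        for cluster in clusters:
            if any(match(rec, other) for other in cluster):
                cluster.append(rec)
                break
        else:
            clusters.append([rec])
    return clusters


def on_demand_er(blocks, match, key):
    """Yield merged entities in descending order of `key`, cleaning lazily.

    Assumes conflicts on `key` are resolved with MAX, so the largest value
    in an unresolved block is an upper bound for every entity inside it.
    """
    tie = count()  # unique tie-breaker, so the heap never compares dicts
    heap = [(-max(r[key] for r in b), next(tie), False, b) for b in blocks]
    heapq.heapify(heap)
    while heap:
        _, _, resolved, item = heapq.heappop(heap)
        if resolved:
            # No unresolved block can beat this entity: safe to emit.
            yield item
            continue
        for cluster in resolve_block(item, match):
            entity = dict(cluster[0])  # first record as representative
            entity[key] = max(r[key] for r in cluster)
            heapq.heappush(heap, (-entity[key], next(tie), True, entity))


blocks = [
    [{"name": "canon eos 5d", "price": 900},
     {"name": "Canon EOS 5D", "price": 950},
     {"name": "canon eos 400d", "price": 300}],
    [{"name": "nikon d90", "price": 700},
     {"name": "Nikon D90", "price": 650}],
]


def same_name(a, b):
    return a["name"].lower() == b["name"].lower()


# Top-k query: take only the first two entities, then stop.
top2 = list(islice(on_demand_er(blocks, same_name, "price"), 2))
```

Because `on_demand_er` is a generator, a caller can stop after k results and resume later by continuing to iterate over the same generator object, which mirrors the top-k and stop-and-resume behavior the abstract describes.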
CRIS type:
Conference proceedings paper
Keywords:
Data Integration; ELT; Entity Resolution
Author list:
Simonini, Giovanni; Zecchini, Luca; Naumann, Felix; Bergamaschi, Sonia
University authors:
BERGAMASCHI Sonia
SIMONINI GIOVANNI
Link to full record:
https://iris.unimore.it/handle/11380/1317066
Link to full text:
https://iris.unimore.it//retrieve/handle/11380/1317066/592204/paper70.pdf
Book title:
Proceedings of the 31st Symposium of Advanced Database Systems, Galzignano Terme, Italy, July 2nd to 5th, 2023
Published in:
CEUR WORKSHOP PROCEEDINGS (Journal, Series)
General data

URL

https://ceur-ws.org/Vol-3478/paper70.pdf