Publication date:
2020
Citation:
Scaling up Record-level Matching Rules / Gagliardelli, L.; Simonini, G.; Bergamaschi, S. - 2646 (2020), pp. 12-23. (28th Italian Symposium on Advanced Database Systems, SEBD 2020, Italy, 2020).
Abstract:
Record-level matching rules are chains of similarity join predicates on multiple attributes, employed to join records that refer to the same real-world object when an explicit foreign key is not available in the data sets at hand. They are widely used by data scientists and practitioners who work with data lakes, open data, and data in the wild. In this work we present a novel technique for efficiently executing record-level matching rules on parallel and distributed systems, and we demonstrate its efficiency on a real-world data set.
CRIS type:
Conference proceedings paper
Keywords:
Data integration; Entity resolution; Parallel similarity join
Authors:
Gagliardelli, L.; Simonini, G.; Bergamaschi, S.
Link to full record:
Book title:
CEUR Workshop Proceedings
Published in: