ORLF: A flexible framework for online record linkage and fusion

El Kindi Rezig, Eduard C. Dragut, Mourad Ouzzani, Ahmed Elmagarmid, Walid G. Aref

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

With the exponential growth of data on the Web comes the opportunity to integrate multiple sources to give more accurate answers to user queries. Upon retrieving records from multiple Web databases, a key task is to merge records that refer to the same real-world entity. We demonstrate ORLF (Online Record Linkage and Fusion), a flexible query-time record linkage and fusion framework. ORLF deduplicates newly arriving query results jointly with previously processed query results. We use an iterative caching solution that leverages query locality to effectively deduplicate newly incoming records with cached records. ORLF aims to deliver timely query answers that are duplicate-free and reflect knowledge collected from previous queries.
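
The abstract's core idea, deduplicating each new batch of query results against records kept from earlier queries and fusing the matches, can be illustrated with a minimal sketch. This is not the authors' implementation: it does not attempt to reproduce ORLF's iterative caching or its use of query locality, and the `LinkageCache` class, the `similarity` and `fuse` helpers, the `name` field, and the 0.85 threshold are all hypothetical choices made for illustration only.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (assumed matcher)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuse(cached: dict, incoming: dict) -> dict:
    """Fuse two records: keep cached values, fill empty fields from incoming."""
    merged = dict(cached)
    for key, value in incoming.items():
        if not merged.get(key):
            merged[key] = value
    return merged


class LinkageCache:
    """Hypothetical cache of already-deduplicated records from past queries."""

    def __init__(self, threshold: float = 0.85):
        self.threshold = threshold  # assumed match threshold
        self.records: list[dict] = []

    def process(self, results: list[dict]) -> list[dict]:
        """Link each incoming record to the cache and return fused answers."""
        answer_ids: list[int] = []
        for rec in results:
            match = None
            for i, cached in enumerate(self.records):
                if similarity(rec["name"], cached["name"]) >= self.threshold:
                    match = i
                    break
            if match is None:
                # No cached duplicate: store the record as a new entity.
                self.records.append(dict(rec))
                match = len(self.records) - 1
            else:
                # Duplicate found: fuse it into the cached entity.
                self.records[match] = fuse(self.records[match], rec)
            if match not in answer_ids:
                answer_ids.append(match)
        return [self.records[i] for i in answer_ids]


cache = LinkageCache()
first = cache.process([{"name": "Cafe Roma", "phone": ""}])
second = cache.process([
    {"name": "cafe roma", "phone": "555-0100"},
    {"name": "Blue Door Diner", "phone": "555-0199"},
])
```

In this sketch the second query returns two entities: the cached "Cafe Roma" record, now carrying the phone number learned from the later result, and the new diner. That mirrors, in a very simplified form, the stated goal of answers that are duplicate-free and reflect knowledge collected from previous queries.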

Original language: English
Title of host publication: 2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1378-1381
Number of pages: 4
ISBN (Electronic): 9781509020195
DOI: https://doi.org/10.1109/ICDE.2016.7498349
Publication status: Published - 22 Jun 2016
Event: 32nd IEEE International Conference on Data Engineering, ICDE 2016 - Helsinki, Finland
Duration: 16 May 2016 to 20 May 2016

Other

Other: 32nd IEEE International Conference on Data Engineering, ICDE 2016
Country: Finland
City: Helsinki
Period: 16/5/16 to 20/5/16

Fingerprint

Fusion reactions
Record linkage
Fusion
Query
World Wide Web

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computational Theory and Mathematics
  • Computer Graphics and Computer-Aided Design
  • Computer Networks and Communications
  • Information Systems
  • Information Systems and Management

Cite this

Rezig, E. K., Dragut, E. C., Ouzzani, M., Elmagarmid, A., & Aref, W. G. (2016). ORLF: A flexible framework for online record linkage and fusion. In 2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016 (pp. 1378-1381). [7498349] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICDE.2016.7498349

ORLF: A flexible framework for online record linkage and fusion. / Rezig, El Kindi; Dragut, Eduard C.; Ouzzani, Mourad; Elmagarmid, Ahmed; Aref, Walid G.

2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016. Institute of Electrical and Electronics Engineers Inc., 2016. p. 1378-1381 7498349.

Rezig, EK, Dragut, EC, Ouzzani, M, Elmagarmid, A & Aref, WG 2016, ORLF: A flexible framework for online record linkage and fusion. in 2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016, 7498349, Institute of Electrical and Electronics Engineers Inc., pp. 1378-1381, 32nd IEEE International Conference on Data Engineering, ICDE 2016, Helsinki, Finland, 16/5/16. https://doi.org/10.1109/ICDE.2016.7498349
Rezig EK, Dragut EC, Ouzzani M, Elmagarmid A, Aref WG. ORLF: A flexible framework for online record linkage and fusion. In 2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016. Institute of Electrical and Electronics Engineers Inc. 2016. p. 1378-1381. 7498349 https://doi.org/10.1109/ICDE.2016.7498349
Rezig, El Kindi; Dragut, Eduard C.; Ouzzani, Mourad; Elmagarmid, Ahmed; Aref, Walid G. / ORLF: A flexible framework for online record linkage and fusion. 2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016. Institute of Electrical and Electronics Engineers Inc., 2016. pp. 1378-1381
@inproceedings{411b1347b2a14911981afc920ce37ffb,
title = "ORLF: A flexible framework for online record linkage and fusion",
abstract = "With the exponential growth of data on the Web comes the opportunity to integrate multiple sources to give more accurate answers to user queries. Upon retrieving records from multiple Web databases, a key task is to merge records that refer to the same real-world entity. We demonstrate ORLF (Online Record Linkage and Fusion), a flexible query-time record linkage and fusion framework. ORLF deduplicates newly arriving query results jointly with previously processed query results. We use an iterative caching solution that leverages query locality to effectively deduplicate newly incoming records with cached records. ORLF aims to deliver timely query answers that are duplicate-free and reflect knowledge collected from previous queries.",
author = "Rezig, {El Kindi} and Dragut, {Eduard C.} and Ouzzani, Mourad and Elmagarmid, Ahmed and Aref, {Walid G.}",
year = "2016",
month = "6",
day = "22",
doi = "10.1109/ICDE.2016.7498349",
language = "English",
pages = "1378--1381",
booktitle = "2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
}

TY  - GEN
T1  - ORLF
T2  - A flexible framework for online record linkage and fusion
AU  - Rezig, El Kindi
AU  - Dragut, Eduard C.
AU  - Ouzzani, Mourad
AU  - Elmagarmid, Ahmed
AU  - Aref, Walid G.
PY  - 2016/6/22
Y1  - 2016/6/22
N2  - With the exponential growth of data on the Web comes the opportunity to integrate multiple sources to give more accurate answers to user queries. Upon retrieving records from multiple Web databases, a key task is to merge records that refer to the same real-world entity. We demonstrate ORLF (Online Record Linkage and Fusion), a flexible query-time record linkage and fusion framework. ORLF deduplicates newly arriving query results jointly with previously processed query results. We use an iterative caching solution that leverages query locality to effectively deduplicate newly incoming records with cached records. ORLF aims to deliver timely query answers that are duplicate-free and reflect knowledge collected from previous queries.
AB  - With the exponential growth of data on the Web comes the opportunity to integrate multiple sources to give more accurate answers to user queries. Upon retrieving records from multiple Web databases, a key task is to merge records that refer to the same real-world entity. We demonstrate ORLF (Online Record Linkage and Fusion), a flexible query-time record linkage and fusion framework. ORLF deduplicates newly arriving query results jointly with previously processed query results. We use an iterative caching solution that leverages query locality to effectively deduplicate newly incoming records with cached records. ORLF aims to deliver timely query answers that are duplicate-free and reflect knowledge collected from previous queries.
UR  - http://www.scopus.com/inward/record.url?scp=84980320238&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84980320238&partnerID=8YFLogxK
U2  - 10.1109/ICDE.2016.7498349
DO  - 10.1109/ICDE.2016.7498349
M3  - Conference contribution
AN  - SCOPUS:84980320238
SP  - 1378
EP  - 1381
BT  - 2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016
PB  - Institute of Electrical and Electronics Engineers Inc.
ER  -