Interactive data repairing

The FALCON dive

Enzo Veltri, Donatello Santoro, Giansalvatore Mecca, Paolo Papotti, Jian He, Gouliang Li, Nan Tang

Research output: Contribution to conference (Paper)

Abstract

In this paper we discuss Falcon, an interactive, deterministic, and declarative data cleaning system. Unlike traditional rule-based systems, Falcon does not rely on the existence of a set of pre-defined data quality rules; instead, it encourages users to explore the data, identify possible problems, and make updates to fix them. The main technical challenge consists in finding a set of rules, expressed as SQL update queries, that are semantically correct and that fix the largest number of errors in the data. Falcon navigates the lattice of candidate rules by interacting with users to gradually check the correctness of a set of rules. We have conducted extensive experiments using both real-world and synthetic datasets to show that Falcon can effectively communicate with users in data repairing.
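The idea of rules expressed as SQL update queries and organized in a lattice can be illustrated with a minimal sketch. It rests on an assumption the abstract does not spell out: that candidate rules are obtained by generalizing a single user update, using the old value of the repaired cell plus any subset of the other attribute values in the same row as the WHERE condition. The function names `candidate_rules` and `quote`, the table name `T`, and the sample row are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def quote(value):
    """Render a value as a SQL string literal (single quotes doubled)."""
    return "'" + str(value).replace("'", "''") + "'"


def candidate_rules(table, row, attr, new_value):
    """Enumerate candidate UPDATE rules that generalize one user repair.

    Assumption (not from the paper): each candidate keeps the condition
    attr = old value and adds equality conditions on a subset of the other
    attributes of the repaired row. The subsets, ordered by inclusion,
    form the lattice; rules are emitted from most general to most specific.
    """
    old_value = row[attr]
    context = sorted(name for name in row if name != attr)
    rules = []
    for size in range(len(context) + 1):
        for subset in combinations(context, size):
            conditions = [(attr, old_value)] + [(name, row[name]) for name in subset]
            where = " AND ".join(f"{name} = {quote(val)}" for name, val in conditions)
            rules.append(
                f"UPDATE {table} SET {attr} = {quote(new_value)} WHERE {where};"
            )
    return rules


if __name__ == "__main__":
    # Hypothetical repair: the user changes C from 'c1' to 'c2' in one row.
    for rule in candidate_rules("T", {"A": "a1", "B": "b1", "C": "c1"}, "C", "c2"):
        print(rule)
```

With n other attributes in the row this yields 2^n candidates, which suggests why checking every rule with the user is impractical and why an interactive search over the lattice is needed.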

Original language: English
Pages: 267-274
Number of pages: 8
Publication status: Published - 1 Jan 2017
Event: 25th Italian Symposium on Advanced Database Systems, SEBD 2017 - Squillace Lido, Catanzaro, Italy
Duration: 25 Jun 2017 to 29 Jun 2017

Other

Other: 25th Italian Symposium on Advanced Database Systems, SEBD 2017
Country: Italy
City: Squillace Lido, Catanzaro
Period: 25/6/17 to 29/6/17

Fingerprint

Knowledge based systems
Cleaning
Experiments

ASJC Scopus subject areas

  • Software
  • Information Systems

Cite this

Veltri, E., Santoro, D., Mecca, G., Papotti, P., He, J., Li, G., & Tang, N. (2017). Interactive data repairing: The FALCON dive. 267-274. Paper presented at 25th Italian Symposium on Advanced Database Systems, SEBD 2017, Squillace Lido, Catanzaro, Italy.

@conference{42c8626968e54185a24f938a665f0bb5,
title = "Interactive data repairing: The FALCON dive",
abstract = "In this paper we discuss Falcon, an interactive, deterministic, and declarative data cleaning system. Unlike traditional rule-based systems, Falcon does not rely on the existence of a set of pre-defined data quality rules; instead, it encourages users to explore the data, identify possible problems, and make updates to fix them. The main technical challenge consists in finding a set of rules, expressed as SQL update queries, that are semantically correct and that fix the largest number of errors in the data. Falcon navigates the lattice of candidate rules by interacting with users to gradually check the correctness of a set of rules. We have conducted extensive experiments using both real-world and synthetic datasets to show that Falcon can effectively communicate with users in data repairing.",
author = "Enzo Veltri and Donatello Santoro and Giansalvatore Mecca and Paolo Papotti and Jian He and Gouliang Li and Nan Tang",
year = "2017",
month = "1",
day = "1",
language = "English",
pages = "267--274",
note = "25th Italian Symposium on Advanced Database Systems, SEBD 2017 ; Conference date: 25-06-2017 Through 29-06-2017",

}

TY - CONF

T1 - Interactive data repairing

T2 - The FALCON dive

AU - Veltri, Enzo

AU - Santoro, Donatello

AU - Mecca, Giansalvatore

AU - Papotti, Paolo

AU - He, Jian

AU - Li, Gouliang

AU - Tang, Nan

PY - 2017/1/1

Y1 - 2017/1/1

N2 - In this paper we discuss Falcon, an interactive, deterministic, and declarative data cleaning system. Unlike traditional rule-based systems, Falcon does not rely on the existence of a set of pre-defined data quality rules; instead, it encourages users to explore the data, identify possible problems, and make updates to fix them. The main technical challenge consists in finding a set of rules, expressed as SQL update queries, that are semantically correct and that fix the largest number of errors in the data. Falcon navigates the lattice of candidate rules by interacting with users to gradually check the correctness of a set of rules. We have conducted extensive experiments using both real-world and synthetic datasets to show that Falcon can effectively communicate with users in data repairing.

AB - In this paper we discuss Falcon, an interactive, deterministic, and declarative data cleaning system. Unlike traditional rule-based systems, Falcon does not rely on the existence of a set of pre-defined data quality rules; instead, it encourages users to explore the data, identify possible problems, and make updates to fix them. The main technical challenge consists in finding a set of rules, expressed as SQL update queries, that are semantically correct and that fix the largest number of errors in the data. Falcon navigates the lattice of candidate rules by interacting with users to gradually check the correctness of a set of rules. We have conducted extensive experiments using both real-world and synthetic datasets to show that Falcon can effectively communicate with users in data repairing.

UR - http://www.scopus.com/inward/record.url?scp=85035050920&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85035050920&partnerID=8YFLogxK

M3 - Paper

SP - 267

EP - 274

ER -