The LLUNATIC data-cleaning framework

Floris Geerts, Giansalvatore Mecca, Paolo Papotti, Donatello Santoro

Research output: Chapter in Book/Report/Conference proceeding › Chapter

94 Citations (Scopus)

Abstract

Data-cleaning (or data-repairing) is considered a crucial problem in many database-related tasks. It consists in making a database consistent with respect to a set of given constraints. In recent years, repairing methods have been proposed for several classes of constraints. However, these methods rely on ad hoc decisions and tend to hard-code the strategy to repair conflicting values. As a consequence, there is currently no general algorithm to solve database repairing problems that involve different kinds of constraints and different strategies to select preferred values. In this paper we develop a uniform framework to solve this problem. We propose a new semantics for repairs, and a chase-based algorithm to compute minimal solutions. We implemented the framework in a DBMS-based prototype, and we report experimental results that confirm its good scalability and superior quality in computing repairs.
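To make the problem statement in the abstract concrete, here is a minimal Python sketch of what "making a database consistent with respect to a constraint" can look like for a single functional dependency (zip determines city). It is an illustration only, not the LLUNATIC algorithm and not a chase procedure. The table, the function names (fd_violations, repair, prefer_most_frequent) and the majority-vote preference are all assumptions made for this example. The preference strategy is passed in as a parameter rather than hard-coded, which mirrors the abstract's point about supporting different strategies to select preferred values.

```python
from collections import Counter, defaultdict


def fd_violations(rows, lhs, rhs):
    """Return {lhs_key: [row indexes]} for groups that violate the FD lhs -> rhs."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[tuple(row[a] for a in lhs)].append(i)
    # A group violates the dependency when its rows disagree on the rhs value.
    return {
        key: idxs
        for key, idxs in groups.items()
        if len({rows[i][rhs] for i in idxs}) > 1
    }


def prefer_most_frequent(values):
    """Hypothetical preference strategy: keep the most frequent value."""
    return Counter(values).most_common(1)[0][0]


def repair(rows, lhs, rhs, prefer):
    """Return a copy of rows in which every violating group agrees on rhs."""
    repaired = [dict(r) for r in rows]
    for idxs in fd_violations(rows, lhs, rhs).values():
        chosen = prefer([rows[i][rhs] for i in idxs])
        for i in idxs:
            repaired[i][rhs] = chosen
    return repaired


# Toy table (invented for this example): one zip code with conflicting cities.
rows = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "NYC"},
    {"zip": "60601", "city": "Chicago"},
]

violations = fd_violations(rows, ["zip"], "city")
repaired = repair(rows, ["zip"], "city", prefer_most_frequent)
remaining = fd_violations(repaired, ["zip"], "city")
```

Swapping prefer_most_frequent for another function (for example one that prefers the most recent or the most trusted value) changes the repair without touching the violation check, which is the kind of separation the abstract argues existing methods lack.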

Original language: English
Title of host publication: Proceedings of the VLDB Endowment
Pages: 625-636
Number of pages: 12
Volume: 6
Edition: 9
Publication status: Published - 2013

Fingerprint

Cleaning
Repair
Preferred numbers
Scalability
Semantics

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Computer Science (all)

Cite this

Geerts, F., Mecca, G., Papotti, P., & Santoro, D. (2013). The LLUNATIC data-cleaning framework. In Proceedings of the VLDB Endowment (9 ed., Vol. 6, pp. 625-636)

The LLUNATIC data-cleaning framework. / Geerts, Floris; Mecca, Giansalvatore; Papotti, Paolo; Santoro, Donatello.

Proceedings of the VLDB Endowment. Vol. 6 9. ed. 2013. p. 625-636.

Geerts, F, Mecca, G, Papotti, P & Santoro, D 2013, The LLUNATIC data-cleaning framework. in Proceedings of the VLDB Endowment. 9 edn, vol. 6, pp. 625-636.
Geerts F, Mecca G, Papotti P, Santoro D. The LLUNATIC data-cleaning framework. In Proceedings of the VLDB Endowment. 9 ed. Vol. 6. 2013. p. 625-636
Geerts, Floris ; Mecca, Giansalvatore ; Papotti, Paolo ; Santoro, Donatello. / The LLUNATIC data-cleaning framework. Proceedings of the VLDB Endowment. Vol. 6 9. ed. 2013. pp. 625-636
@inbook{cbf51f26d2474388b612e06a653cdd8e,
title = "The LLUNATIC data-cleaning framework",
abstract = "Data-cleaning (or data-repairing) is considered a crucial problem in many database-related tasks. It consists in making a database consistent with respect to a set of given constraints. In recent years, repairing methods have been proposed for several classes of constraints. However, these methods rely on ad hoc decisions and tend to hard-code the strategy to repair conflicting values. As a consequence, there is currently no general algorithm to solve database repairing problems that involve different kinds of constraints and different strategies to select preferred values. In this paper we develop a uniform framework to solve this problem. We propose a new semantics for repairs, and a chase-based algorithm to compute minimal solutions. We implemented the framework in a DBMS-based prototype, and we report experimental results that confirm its good scalability and superior quality in computing repairs.",
author = "Floris Geerts and Giansalvatore Mecca and Paolo Papotti and Donatello Santoro",
year = "2013",
language = "English",
volume = "6",
pages = "625--636",
booktitle = "Proceedings of the VLDB Endowment",
edition = "9",

}

TY - CHAP

T1 - The LLUNATIC data-cleaning framework

AU - Geerts, Floris

AU - Mecca, Giansalvatore

AU - Papotti, Paolo

AU - Santoro, Donatello

PY - 2013

Y1 - 2013

N2 - Data-cleaning (or data-repairing) is considered a crucial problem in many database-related tasks. It consists in making a database consistent with respect to a set of given constraints. In recent years, repairing methods have been proposed for several classes of constraints. However, these methods rely on ad hoc decisions and tend to hard-code the strategy to repair conflicting values. As a consequence, there is currently no general algorithm to solve database repairing problems that involve different kinds of constraints and different strategies to select preferred values. In this paper we develop a uniform framework to solve this problem. We propose a new semantics for repairs, and a chase-based algorithm to compute minimal solutions. We implemented the framework in a DBMS-based prototype, and we report experimental results that confirm its good scalability and superior quality in computing repairs.

UR - http://www.scopus.com/inward/record.url?scp=84882696854&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84882696854&partnerID=8YFLogxK

M3 - Chapter

AN - SCOPUS:84882696854

VL - 6

SP - 625

EP - 636

BT - Proceedings of the VLDB Endowment

ER -