Data quality problems beyond consistency and deduplication

Wenfei Fan, Floris Geerts, Shuai Ma, Nan Tang, Wenyuan Yu

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

9 Citations (Scopus)

Abstract

Recent work on data quality has primarily focused on data repairing algorithms for improving data consistency and record matching methods for data deduplication. This paper accentuates several other challenging issues that are essential to developing data cleaning systems, namely, error correction with performance guarantees, unification of data repairing and record matching, relative information completeness, and data currency. We provide an overview of recent advances in the study of these issues, and advocate the need for developing a logical framework for a uniform treatment of these issues.

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 237-249
Number of pages: 13
Volume: 8000
DOIs: https://doi.org/10.1007/978-3-642-41660-6-12
Publication status: Published - 1 Dec 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8000
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Fan, W., Geerts, F., Ma, S., Tang, N., & Yu, W. (2013). Data quality problems beyond consistency and deduplication. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8000, pp. 237-249). https://doi.org/10.1007/978-3-642-41660-6-12