CerFix

A system for cleaning data with certain fixes

Wenfei Fan, Jianzhong Li, Shuai Ma, Nan Tang, Wenyuan Yu

Research output: Chapter in Book/Report/Conference proceeding › Chapter

9 Citations (Scopus)

Abstract

We present CerFix, a data cleaning system that finds certain fixes for tuples at the point of data entry, i.e., fixes that are guaranteed correct. It is based on master data, editing rules and certain regions. Given some attributes of an input tuple that are validated (assured correct), editing rules tell us what other attributes to fix and how to correct them with master data. A certain region is a set of attributes that, if validated, warrant a certain fix for the entire tuple. We demonstrate the following facilities provided by CerFix: (1) a region finder to identify certain regions; (2) a data monitor to find certain fixes for input tuples, by guiding users to validate a minimal number of attributes; and (3) an auditing module to show what attributes are fixed and where the correct values come from.
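The interplay of validated attributes, editing rules and master data described above can be illustrated with a minimal Python sketch. It is not CerFix's implementation: the `EditingRule` class, the `apply_rules` function, the attribute names and the sample master data are all hypothetical, chosen only to show how a validated attribute can justify copying values from a matching master tuple. As a simplifying assumption, a rule fires only when exactly one master tuple matches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditingRule:
    """Hypothetical rule: if `match_attrs` are validated and agree with a
    master tuple, copy `fix_attrs` from that master tuple."""
    match_attrs: tuple
    fix_attrs: tuple


def apply_rules(input_tuple, validated, rules, master):
    """Propagate fixes from validated attributes until nothing changes.

    Returns the fixed tuple, the final set of validated attributes, and an
    audit trail of (attribute, old value, new value, master row index).
    """
    fixed = dict(input_tuple)
    validated = set(validated)
    audit = []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            # A rule applies only if everything it matches on is validated
            # and it still has something left to fix.
            if not set(rule.match_attrs) <= validated:
                continue
            if set(rule.fix_attrs) <= validated:
                continue
            matches = [
                (i, m) for i, m in enumerate(master)
                if all(m[a] == fixed.get(a) for a in rule.match_attrs)
            ]
            if len(matches) != 1:
                continue  # assumption: fix only on an unambiguous match
            index, master_row = matches[0]
            for attr in rule.fix_attrs:
                if attr not in validated:
                    audit.append((attr, fixed.get(attr), master_row[attr], index))
                    fixed[attr] = master_row[attr]
                    validated.add(attr)
                    changed = True
    return fixed, validated, audit


# Hypothetical master data and rules, for illustration only.
master = [{"zip": "EH8 9AB", "city": "Edinburgh", "area_code": "131"}]
rules = [
    EditingRule(match_attrs=("zip",), fix_attrs=("city",)),
    EditingRule(match_attrs=("city",), fix_attrs=("area_code",)),
]
entry = {"zip": "EH8 9AB", "city": "Ldn", "area_code": "020"}

fixed, validated, audit = apply_rules(entry, {"zip"}, rules, master)
```

In this toy example, validating `zip` alone is enough for every attribute of the tuple to end up validated, which mirrors the role a certain region plays for this one tuple; the actual definition in the abstract quantifies over input tuples in general, which this sketch does not attempt to check. The `audit` list loosely corresponds to the auditing facility, recording which attributes were changed and which master row supplied the value.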

Original language: English
Title of host publication: Proceedings of the VLDB Endowment
Pages: 1375-1378
Number of pages: 4
Volume: 4
Edition: 12
Publication status: Published - Aug 2011
Externally published: Yes

Fingerprint

Cleaning
Data acquisition

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Computer Science (all)

Cite this

Fan, W., Li, J., Ma, S., Tang, N., & Yu, W. (2011). CerFix: A system for cleaning data with certain fixes. In Proceedings of the VLDB Endowment (12 ed., Vol. 4, pp. 1375-1378)

@inbook{40f84ab645c24f539d3d758bac5befeb,
title = "CerFix: A system for cleaning data with certain fixes",
abstract = "We present CerFix, a data cleaning system that finds certain fixes for tuples at the point of data entry, i.e., fixes that are guaranteed correct. It is based on master data, editing rules and certain regions. Given some attributes of an input tuple that are validated (assured correct), editing rules tell us what other attributes to fix and how to correct them with master data. A certain region is a set of attributes that, if validated, warrant a certain fix for the entire tuple. We demonstrate the following facilities provided by CerFix: (1) a region finder to identify certain regions; (2) a data monitor to find certain fixes for input tuples, by guiding users to validate a minimal number of attributes; and (3) an auditing module to show what attributes are fixed and where the correct values come from.",
author = "Wenfei Fan and Jianzhong Li and Shuai Ma and Nan Tang and Wenyuan Yu",
year = "2011",
month = "8",
language = "English",
volume = "4",
pages = "1375--1378",
booktitle = "Proceedings of the VLDB Endowment",
edition = "12",
}

TY  - CHAP
T1  - CerFix
T2  - A system for cleaning data with certain fixes
AU  - Fan, Wenfei
AU  - Li, Jianzhong
AU  - Ma, Shuai
AU  - Tang, Nan
AU  - Yu, Wenyuan
PY  - 2011/8
Y1  - 2011/8
N2  - We present CerFix, a data cleaning system that finds certain fixes for tuples at the point of data entry, i.e., fixes that are guaranteed correct. It is based on master data, editing rules and certain regions. Given some attributes of an input tuple that are validated (assured correct), editing rules tell us what other attributes to fix and how to correct them with master data. A certain region is a set of attributes that, if validated, warrant a certain fix for the entire tuple. We demonstrate the following facilities provided by CerFix: (1) a region finder to identify certain regions; (2) a data monitor to find certain fixes for input tuples, by guiding users to validate a minimal number of attributes; and (3) an auditing module to show what attributes are fixed and where the correct values come from.
AB  - We present CerFix, a data cleaning system that finds certain fixes for tuples at the point of data entry, i.e., fixes that are guaranteed correct. It is based on master data, editing rules and certain regions. Given some attributes of an input tuple that are validated (assured correct), editing rules tell us what other attributes to fix and how to correct them with master data. A certain region is a set of attributes that, if validated, warrant a certain fix for the entire tuple. We demonstrate the following facilities provided by CerFix: (1) a region finder to identify certain regions; (2) a data monitor to find certain fixes for input tuples, by guiding users to validate a minimal number of attributes; and (3) an auditing module to show what attributes are fixed and where the correct values come from.
UR  - http://www.scopus.com/inward/record.url?scp=84863765052&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84863765052&partnerID=8YFLogxK
M3  - Chapter
VL  - 4
SP  - 1375
EP  - 1378
BT  - Proceedings of the VLDB Endowment
ER  -