Big data cleaning

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

18 Citations (Scopus)

Abstract

Data cleaning is a lively subject that has played an important part in the history of data management and data analytics, and it is still undergoing rapid development. Moreover, data cleaning is considered a main challenge in the era of big data, due to the increasing volume, velocity and variety of data in many applications. This paper aims to provide an overview of recent work in different aspects of data cleaning: error detection methods, data repairing algorithms, and a generalized data cleaning system. It also includes some discussion of our efforts on data cleaning methods from the perspective of big data, in terms of volume, velocity and variety.

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 13-24
Number of pages: 12
Volume: 8709 LNCS
ISBN (Print): 9783319111155
DOI: https://doi.org/10.1007/978-3-319-11116-2_2
Publication status: Published - 2014
Event: 16th Asia-Pacific Web Conference on Web Technologies and Applications, APWeb 2014 - Changsha, China
Duration: 5 Sep 2014 to 7 Sep 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8709 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 16th Asia-Pacific Web Conference on Web Technologies and Applications, APWeb 2014
Country: China
City: Changsha
Period: 5/9/14 to 7/9/14

Fingerprint

Cleaning
Error detection
Information management
Big data
Data Management

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Tang, N. (2014). Big data cleaning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8709 LNCS, pp. 13-24). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8709 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-319-11116-2_2

Big data cleaning. / Tang, Nan.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 8709 LNCS Springer Verlag, 2014. p. 13-24 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8709 LNCS).

Tang, N 2014, Big data cleaning. in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). vol. 8709 LNCS, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8709 LNCS, Springer Verlag, pp. 13-24, 16th Asia-Pacific Web Conference on Web Technologies and Applications, APWeb 2014, Changsha, China, 5/9/14. https://doi.org/10.1007/978-3-319-11116-2_2
Tang N. Big data cleaning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 8709 LNCS. Springer Verlag. 2014. p. 13-24. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-319-11116-2_2
Tang, Nan. / Big data cleaning. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 8709 LNCS Springer Verlag, 2014. pp. 13-24 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{c997d41ac78a46d9a6d96c3ef6e41bbd,
title = "Big data cleaning",
abstract = "Data cleaning is, in fact, a lively subject that has played an important part in the history of data management and data analytics, and it still is undergoing rapid development. Moreover, data cleaning is considered as a main challenge in the era of big data, due to the increasing volume, velocity and variety of data in many applications. This paper aims to provide an overview of recent work in different aspects of data cleaning: error detection methods, data repairing algorithms, and a generalized data cleaning system. It also includes some discussion about our efforts of data cleaning methods from the perspective of big data, in terms of volume, velocity and variety.",
author = "Nan Tang",
year = "2014",
doi = "10.1007/978-3-319-11116-2_2",
language = "English",
isbn = "9783319111155",
volume = "8709 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "13--24",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
}

TY - GEN
T1 - Big data cleaning
AU - Tang, Nan
PY - 2014
Y1 - 2014

N2 - Data cleaning is, in fact, a lively subject that has played an important part in the history of data management and data analytics, and it still is undergoing rapid development. Moreover, data cleaning is considered as a main challenge in the era of big data, due to the increasing volume, velocity and variety of data in many applications. This paper aims to provide an overview of recent work in different aspects of data cleaning: error detection methods, data repairing algorithms, and a generalized data cleaning system. It also includes some discussion about our efforts of data cleaning methods from the perspective of big data, in terms of volume, velocity and variety.

UR - http://www.scopus.com/inward/record.url?scp=84958547258&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84958547258&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-11116-2_2
DO - 10.1007/978-3-319-11116-2_2
M3 - Conference contribution
AN - SCOPUS:84958547258
SN - 9783319111155
VL - 8709 LNCS
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 13
EP - 24
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
PB - Springer Verlag
ER -