Big RDF data cleaning

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

Data cleaning has long played an important part in data management and data analytics. High-quality data has proven crucial for businesses that make data-driven decisions, especially in the information age and the era of big data. The Resource Description Framework (RDF) is a standard model for data interchange on the Semantic Web. However, RDF data is known to be dirty, since much of it is automatically extracted from the web. In this paper, we first revisit the data quality problems that appear in RDF data. Although much effort has gone into cleaning RDF data, most of it unfortunately relies on laborious manual evaluation. We also describe possible solutions that shed light on (semi-)automatically cleaning (big) RDF data.

Original language: English
Title of host publication: Proceedings - International Conference on Data Engineering
Publisher: IEEE Computer Society
Pages: 77-79
Number of pages: 3
Volume: 2015-June
ISBN (Print): 9781479984411
DOIs: https://doi.org/10.1109/ICDEW.2015.7129549
Publication status: Published - 19 Jun 2015
Event: 2015 31st IEEE International Conference on Data Engineering Workshops, ICDEW 2015 - Seoul, Korea, Republic of
Duration: 13 Apr 2015 to 17 Apr 2015

Other

Other: 2015 31st IEEE International Conference on Data Engineering Workshops, ICDEW 2015
Country: Korea, Republic of
City: Seoul
Period: 13/4/15 to 17/4/15

Fingerprint

Data description
Cleaning
Interchanges
Semantic Web
Information management
Decision making
Industry

ASJC Scopus subject areas

  • Information Systems
  • Signal Processing
  • Software

Cite this

Tang, N. (2015). Big RDF data cleaning. In Proceedings - International Conference on Data Engineering (Vol. 2015-June, pp. 77-79). [7129549] IEEE Computer Society. https://doi.org/10.1109/ICDEW.2015.7129549

@inproceedings{af7a90e0708d4cd5a91628499f01a1b5,
title = "Big RDF data cleaning",
author = "Nan Tang",
year = "2015",
month = "6",
day = "19",
doi = "10.1109/ICDEW.2015.7129549",
language = "English",
isbn = "9781479984411",
volume = "2015-June",
pages = "77--79",
booktitle = "Proceedings - International Conference on Data Engineering",
publisher = "IEEE Computer Society",

}

TY - GEN

T1 - Big RDF data cleaning

AU - Tang, Nan

PY - 2015/6/19

Y1 - 2015/6/19

UR - http://www.scopus.com/inward/record.url?scp=84944312794&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84944312794&partnerID=8YFLogxK

U2 - 10.1109/ICDEW.2015.7129549

DO - 10.1109/ICDEW.2015.7129549

M3 - Conference contribution

SN - 9781479984411

VL - 2015-June

SP - 77

EP - 79

BT - Proceedings - International Conference on Data Engineering

PB - IEEE Computer Society

ER -