Measuring and constraining data quality with analytic workflows

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

One challenging aspect of data quality modeling and management is to provide ways to express requirements on the quality of data. The paper presents a framework for specifying and checking constraints on data quality in HDBMS. The evaluation of quality of data (QoD) is based on the declaration of data quality metrics that are computed and combined into so-called analytic workflows. These workflows are designed as a composition of statistical methods and data mining techniques used to detect patterns of anomalies in the data sets. As metadata they are used to characterize aspects of data quality (e.g., completeness, freshness, consistency, accuracy). The paper proposes a query language extension for constraining data quality when querying both data and its associated QoD metadata. Probabilistic approximate constraints are checked to determine whether the quality of data is acceptable for building quality-constrained query results.
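The idea of computing QoD metrics as metadata and then constraining them can be illustrated with a minimal sketch. It is not the paper's query language extension or its analytic workflows: the sample rows, the function names (`completeness`, `freshness`, `satisfies`), the thresholds, and the tolerance are all assumptions made for illustration, and a plain threshold with a tolerance stands in for the probabilistic approximate constraints mentioned in the abstract.

```python
from datetime import date

# Hypothetical records; None marks a missing value.
rows = [
    {"id": 1, "email": "a@example.org", "updated": date(2008, 7, 30)},
    {"id": 2, "email": None,            "updated": date(2008, 7, 1)},
    {"id": 3, "email": "c@example.org", "updated": date(2008, 5, 15)},
    {"id": 4, "email": "d@example.org", "updated": date(2008, 7, 29)},
]

def completeness(rows, column):
    """Fraction of rows whose value in `column` is not missing."""
    return sum(r[column] is not None for r in rows) / len(rows)

def freshness(rows, column, today, max_age_days):
    """Fraction of rows updated at most `max_age_days` before `today`."""
    return sum((today - r[column]).days <= max_age_days for r in rows) / len(rows)

def satisfies(value, threshold, tolerance=0.0):
    """Simplified constraint: accept values no more than `tolerance` below `threshold`."""
    return value >= threshold - tolerance

# QoD metadata computed for the data set (assumed metric definitions).
qod = {
    "completeness": completeness(rows, "email"),
    "freshness": freshness(rows, "updated", date(2008, 8, 1), max_age_days=45),
}

# Assumed quality requirements: metric name -> (threshold, tolerance).
requirements = {"completeness": (0.7, 0.0), "freshness": (0.9, 0.05)}

# A query result would be built only if every requirement holds.
acceptable = all(
    satisfies(qod[name], threshold, tolerance)
    for name, (threshold, tolerance) in requirements.items()
)
print(qod, acceptable)
```

In this sketch the metrics are stored separately from the data, so a quality requirement can be checked before any rows are returned.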

Original language: English
Title of host publication: CTIT workshop proceedings series
Pages: 103-112
Number of pages: 10
Volume: WP 08
Edition: 02
Publication status: Published - 2008
Externally published: Yes
Event: 6th International Workshop on Quality in Databases, QDB 2008 and 3rd Workshop on Management of Uncertain Data, MUD 2008 - Auckland, New Zealand
Duration: 1 Aug 2008 → 1 Aug 2008

Fingerprint

data quality
workflow
Metadata
Query languages
Data mining
Statistical methods
Chemical analysis
measuring
anomaly
modeling
evaluation

ASJC Scopus subject areas

  • Information Systems
  • Geography, Planning and Development

Cite this

Berti-Equille, L. (2008). Measuring and constraining data quality with analytic workflows. In CTIT workshop proceedings series (02 ed., Vol. WP 08, pp. 103-112)

@inproceedings{a3c5acfbd0884ce797ebfe4f4357cae2,
title = "Measuring and constraining data quality with analytic workflows",
abstract = "One challenging aspect of data quality modeling and management is to provide ways to express requirements on the quality of data. The paper presents a framework for specifying and checking constraints on data quality in HDBMS. The evaluation of quality of data (QoD) is based on the declaration of data quality metrics that are computed and combined into so-called analytic workflows. These workflows are designed as a composition of statistical methods and data mining techniques used to detect patterns of anomalies in the data sets. As metadata they are used to characterize aspects of data quality (e.g., completeness, freshness, consistency, accuracy). The paper proposes a query language extension for constraining data quality when querying both data and its associated QoD metadata. Probabilistic approximate constraints are checked to determine whether the quality of data is acceptable for building quality-constrained query results.",
author = "Laure Berti-Equille",
year = "2008",
language = "English",
volume = "WP 08",
pages = "103--112",
booktitle = "CTIT workshop proceedings series",
edition = "02",

}

TY - GEN

T1 - Measuring and constraining data quality with analytic workflows

AU - Berti-Equille, Laure

PY - 2008

Y1 - 2008

AB - One challenging aspect of data quality modeling and management is to provide ways to express requirements on the quality of data. The paper presents a framework for specifying and checking constraints on data quality in HDBMS. The evaluation of quality of data (QoD) is based on the declaration of data quality metrics that are computed and combined into so-called analytic workflows. These workflows are designed as a composition of statistical methods and data mining techniques used to detect patterns of anomalies in the data sets. As metadata they are used to characterize aspects of data quality (e.g., completeness, freshness, consistency, accuracy). The paper proposes a query language extension for constraining data quality when querying both data and its associated QoD metadata. Probabilistic approximate constraints are checked to determine whether the quality of data is acceptable for building quality-constrained query results.

UR - http://www.scopus.com/inward/record.url?scp=84883005102&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84883005102&partnerID=8YFLogxK

M3 - Conference contribution

VL - WP 08

SP - 103

EP - 112

BT - CTIT workshop proceedings series

ER -