Measuring and constraining data quality with analytic workflows

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

One of the most challenging aspects of data quality modeling and management is to find appropriate ways to express requirements on the quality of data. This paper presents a framework for specifying and checking constraints on data quality in RDBMSs. The evaluation of quality of data (QoD) is based on the declaration of data quality metrics that are computed and combined into so-called analytic workflows, designed as a composition of statistical methods and data mining techniques used to detect patterns of anomalies in the data sets. As metadata, these metrics are used to characterize several QoD dimensions (e.g., completeness, freshness, consistency, accuracy). The paper proposes a query language extension for constraining data quality when querying both data and its associated QoD metadata. Probabilistic approximate constraints are checked to determine whether the quality of the data is acceptable for building quality-constrained query results.
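
The paper's actual constraint syntax and workflow algorithms are not reproduced in this record. As an illustration only, the sketch below shows one way the general idea could look in Python: a completeness metric is computed as QoD metadata, and a sampled (probabilistic approximate) constraint check decides whether a quality-constrained query result may be built. All function names, the threshold, the tolerance, and the sampling scheme are assumptions, not the paper's method.

```python
import random

# Hypothetical sketch (not the paper's actual framework): compute a simple
# QoD metric (completeness) as metadata, then gate a query result with a
# probabilistic approximate constraint check.

def completeness(rows, attribute):
    """Fraction of rows whose value for `attribute` is present (non-None)."""
    present = sum(1 for r in rows if r.get(attribute) is not None)
    return present / len(rows) if rows else 0.0

def approx_constraint_holds(rows, attribute, threshold,
                            tolerance=0.05, sample_size=100, trials=30):
    """Approximate check: estimate completeness from random samples and
    accept if the mean estimate is within `tolerance` of `threshold`."""
    estimates = []
    for _ in range(trials):
        sample = random.choices(rows, k=min(sample_size, len(rows)))
        estimates.append(completeness(sample, attribute))
    mean_estimate = sum(estimates) / len(estimates)
    return mean_estimate >= threshold - tolerance

# Toy data set in which roughly 10% of 'price' values are missing.
rows = [{"id": i, "price": (i * 1.5 if i % 10 else None)} for i in range(1000)]

# Quality-constrained query: return rows only if the QoD constraint on
# 'price' completeness is judged acceptable.
if approx_constraint_holds(rows, "price", threshold=0.85):
    result = [r for r in rows if r["price"] is not None]
    print(f"{len(result)} rows returned; completeness = "
          f"{completeness(rows, 'price'):.2f}")
else:
    print("QoD constraint violated; query rejected")
```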

Original language: English
Title of host publication: CTIT workshop proceedings series
Pages: 103-112
Number of pages: 10
Volume: WP 08
Edition: 02
Publication status: Published - 2008
Externally published: Yes
Event: 6th International Workshop on Quality in Databases, QDB 2008 and 3rd Workshop on Management of Uncertain Data, MUD 2008 - Auckland, New Zealand
Duration: 1 Aug 2008 → 1 Aug 2008

Other

Other: 6th International Workshop on Quality in Databases, QDB 2008 and 3rd Workshop on Management of Uncertain Data, MUD 2008
Country: New Zealand
City: Auckland
Period: 1/8/08 → 1/8/08

ASJC Scopus subject areas

  • Information Systems
  • Geography, Planning and Development

Cite this

Berti-Equille, L. (2008). Measuring and constraining data quality with analytic workflows. In CTIT workshop proceedings series (02 ed., Vol. WP 08, pp. 103-112).