ANMAT: Automatic knowledge discovery and error detection through pattern functional dependencies

Abdulhakim Qahtan, Nan Tang, Mourad Ouzzani, Yang Cao, Michael Stonebraker

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

Knowledge discovery is critical to successful data analytics. We propose a new type of meta-knowledge, namely pattern functional dependencies (PFDs), which combine patterns (regex-like rules) with integrity constraints (ICs) to model dependencies between partial values (patterns) across different attributes in a table. PFDs go beyond classical functional dependencies and their extensions. For instance, in an employee table with the ID “F-9-107”, the prefix “F” determines the finance department. A key application of PFDs is identifying erroneous data, that is, tuples that violate some PFDs. In this demonstration, attendees will experience the following features: (i) PFD discovery: automatically discovering PFDs from (dirty) data in different domains; and (ii) error detection with PFDs: showing errors that are detected by PFDs but cannot be captured by existing approaches.
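The employee-ID example in the abstract can be illustrated with a minimal Python sketch. It is not the paper's formal PFD definition or ANMAT's implementation: the `PatternRule` class, the `violations` helper, the regex, and every row other than the ID “F-9-107” are hypothetical and chosen only for illustration. The sketch treats a rule as "if the left-hand attribute matches a pattern, the right-hand attribute must equal a given value" and reports the tuples that break it.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PatternRule:
    """Hypothetical, simplified stand-in for a pattern dependency."""
    lhs_attr: str     # attribute whose value is matched against a pattern
    lhs_pattern: str  # regex the left-hand value must match for the rule to apply
    rhs_attr: str     # attribute whose value is determined by the match
    rhs_value: str    # value the right-hand attribute is expected to hold


def violations(rows, rule):
    """Return indices of rows that match the pattern but hold a different RHS value."""
    lhs = re.compile(rule.lhs_pattern)
    return [
        i
        for i, row in enumerate(rows)
        if lhs.match(row[rule.lhs_attr]) and row[rule.rhs_attr] != rule.rhs_value
    ]


# Illustrative data: only the ID "F-9-107" comes from the abstract.
rows = [
    {"id": "F-9-107", "dept": "Finance"},
    {"id": "F-2-045", "dept": "Marketing"},  # "F" prefix but wrong department
    {"id": "M-3-210", "dept": "Marketing"},  # rule does not apply to this ID
]

# Assumed rule: an ID starting with "F-" determines the Finance department.
rule = PatternRule("id", r"^F-\d+-\d+$", "dept", "Finance")
bad = violations(rows, rule)
```

Here `bad` holds the index of the second row, whose ID carries the “F” prefix while its department is not Finance. A classical functional dependency on the whole ID value would not flag this row, because no two rows share the same full ID.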

Original language: English
Title of host publication: SIGMOD 2019 - Proceedings of the 2019 International Conference on Management of Data
Publisher: Association for Computing Machinery
Pages: 1977-1980
Number of pages: 4
ISBN (Electronic): 9781450356435
DOIs: https://doi.org/10.1145/3299869.3320209
Publication status: Published - 25 Jun 2019
Event: 2019 International Conference on Management of Data, SIGMOD 2019 - Amsterdam, Netherlands
Duration: 30 Jun 2019 - 5 Jul 2019

Publication series

Name: Proceedings of the ACM SIGMOD International Conference on Management of Data
ISSN (Print): 0730-8078

Conference

Conference: 2019 International Conference on Management of Data, SIGMOD 2019
Country: Netherlands
City: Amsterdam
Period: 30/6/19 - 5/7/19

Keywords

  • Constrained Patterns
  • Error Detection
  • Knowledge Discovery
  • Pattern Functional Dependencies

ASJC Scopus subject areas

  • Software
  • Information Systems

Cite this

Qahtan, A., Tang, N., Ouzzani, M., Cao, Y., & Stonebraker, M. (2019). ANMAT: Automatic knowledge discovery and error detection through pattern functional dependencies. In SIGMOD 2019 - Proceedings of the 2019 International Conference on Management of Data (pp. 1977-1980). (Proceedings of the ACM SIGMOD International Conference on Management of Data). Association for Computing Machinery. https://doi.org/10.1145/3299869.3320209