FAHES

Detecting disguised missing values

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

It is well established that missing values, if not handled properly, may lead to poor data analytics models, misleading conclusions, and limited generalizability of findings. A key challenge in detecting missing values arises when they take a form that is otherwise valid, which makes them hard to distinguish from legitimate values. We propose to demonstrate FAHES, a system for detecting different types of disguised missing values (DMVs), which often occur in real-world data. FAHES consists of several components: a profiler that generates rules for detecting repeated patterns, an outlier detection module, and a module that detects values used repeatedly in random records. Using several real-world datasets, we will demonstrate how FAHES can easily catch DMVs.
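
To make the idea of a disguised missing value concrete, the following is a minimal Python sketch of two simple heuristics loosely inspired by the components named in the abstract: a check for values that repeat far more often than a typical value, and a robust outlier check. It is not the FAHES implementation. The function names, the thresholds (`min_share`, `dominance`, `k`), and the sample `ages` column are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from statistics import median


def frequent_value_candidates(values, min_share=0.05, dominance=3.0):
    """Flag values that repeat far more often than a typical value.

    Hypothetical heuristic: a value is a candidate if it covers at least
    `min_share` of the column and occurs at least `dominance` times as
    often as the median distinct value.
    """
    counts = Counter(values)
    typical = median(counts.values())
    total = len(values)
    flagged = [
        value
        for value, count in counts.items()
        if count / total >= min_share and count >= dominance * typical
    ]
    return sorted(flagged, key=str)


def numeric_outlier_candidates(values, k=3.0):
    """Flag distinct numeric values far from the median.

    Distance is measured in units of the median absolute deviation (MAD);
    values more than `k` MADs away are returned. This is a generic robust
    outlier test, assumed here for illustration.
    """
    nums = [v for v in values if isinstance(v, (int, float))]
    if not nums:
        return []
    center = median(nums)
    mad = median(abs(v - center) for v in nums)
    if mad == 0:
        return []  # no spread, so nothing can be called an outlier
    return sorted({v for v in nums if abs(v - center) / mad > k})


# Hypothetical "age" column where -1 and 999 stand in for "unknown".
ages = [34, 29, 41, 37, -1, 52, -1, 45, 38, -1, 31, 999, 47, -1, 36, 40]

print(frequent_value_candidates(ages))   # [-1]
print(numeric_outlier_candidates(ages))  # [-1, 999]
```

In this toy column, -1 is caught by both checks while 999 is caught only by the outlier check, which suggests why a system might combine several detectors rather than rely on one.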

Original language: English
Title of host publication: Proceedings - IEEE 34th International Conference on Data Engineering, ICDE 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1609-1612
Number of pages: 4
ISBN (Electronic): 9781538655207
DOI: https://doi.org/10.1109/ICDE.2018.00188
Publication status: Published - 24 Oct 2018
Event: 34th IEEE International Conference on Data Engineering, ICDE 2018 - Paris, France
Duration: 16 Apr 2018 to 19 Apr 2018

Other

Other: 34th IEEE International Conference on Data Engineering, ICDE 2018
Country: France
City: Paris
Period: 16/4/18 to 19/4/18

Fingerprint

Missing values
Module
Outlier detection

Keywords

  • Disguised missing values

ASJC Scopus subject areas

  • Information Systems
  • Information Systems and Management
  • Hardware and Architecture

Cite this

Qahtan, A., Elmagarmid, A., Ouzzani, M., & Tang, N. (2018). FAHES: Detecting disguised missing values. In Proceedings - IEEE 34th International Conference on Data Engineering, ICDE 2018 (pp. 1609-1612). [8509409] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICDE.2018.00188

@inproceedings{80287e109a314afa822c88406b945675,
title = "FAHES: Detecting disguised missing values",
abstract = "It is well established that missing values, if not dealt with properly, may lead to poor data analytics models, misleading conclusions, and limitation in the generalization of findings. A key challenge in detecting these missing values is when they manifest themselves in a form that is otherwise valid, making it hard to distinguish them from other legitimate values. We propose to demonstrate FAHES, a system for detecting different types of disguised missing values (DMVs) which often occur in real world data. FAHES consists of several components, namely a profiler to generate rules for detecting repeated patterns, an outlier detection module, and a module to detect values that are used repeatedly in random records. Using several real world datasets, we will demonstrate how FAHES can easily catch DMVs.",
keywords = "Disguised missing values",
author = "Abdulhakim Qahtan and Ahmed Elmagarmid and Mourad Ouzzani and Nan Tang",
year = "2018",
month = "10",
day = "24",
doi = "10.1109/ICDE.2018.00188",
language = "English",
pages = "1609--1612",
booktitle = "Proceedings - IEEE 34th International Conference on Data Engineering, ICDE 2018",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - GEN

T1 - FAHES

T2 - Detecting disguised missing values

AU - Qahtan, Abdulhakim

AU - Elmagarmid, Ahmed

AU - Ouzzani, Mourad

AU - Tang, Nan

PY - 2018/10/24

Y1 - 2018/10/24

AB - It is well established that missing values, if not dealt with properly, may lead to poor data analytics models, misleading conclusions, and limitation in the generalization of findings. A key challenge in detecting these missing values is when they manifest themselves in a form that is otherwise valid, making it hard to distinguish them from other legitimate values. We propose to demonstrate FAHES, a system for detecting different types of disguised missing values (DMVs) which often occur in real world data. FAHES consists of several components, namely a profiler to generate rules for detecting repeated patterns, an outlier detection module, and a module to detect values that are used repeatedly in random records. Using several real world datasets, we will demonstrate how FAHES can easily catch DMVs.

KW - Disguised missing values

UR - http://www.scopus.com/inward/record.url?scp=85051518156&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85051518156&partnerID=8YFLogxK

U2 - 10.1109/ICDE.2018.00188

DO - 10.1109/ICDE.2018.00188

M3 - Conference contribution

SP - 1609

EP - 1612

BT - Proceedings - IEEE 34th International Conference on Data Engineering, ICDE 2018

PB - Institute of Electrical and Electronics Engineers Inc.

ER -