The data analytics group at the Qatar Computing Research Institute

George Beskales, Ihab F. Ilyas, Paolo Papotti, Gautam Das, Felix Naumann, Jorge Arnulfo Quiane Ruiz, Ahmed Elmagarmid, Mourad Ouzzani, Nan Tang

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

The Qatar Computing Research Institute (QCRI), a member of Qatar Foundation for Education, Science and Community Development, started its activities in early 2011. QCRI focuses on large-scale computing challenges that address national priorities for growth and development and that have global impact in computing research. The data analytics group at QCRI (DA@QCRI) has built expertise in three core data management challenges: extracting data from its natural digital habitat, integrating a large and evolving number of sources, and robust cleaning to assure data quality and validation. Cleaning data requires collecting and maintaining a massive amount of metadata, such as data violations, lineage of data changes, and possible data repairs. In addition, users need to better understand the current health of the data and the data cleaning process, through summarization or samples of data errors, before they can effectively guide any data cleaning process. Providing a scalable data cleaning solution requires efficient methods to generate, maintain, and access such metadata.

Original language: English
Pages (from-to): 33-38
Number of pages: 6
Journal: SIGMOD Record
Volume: 41
Issue number: 4
DOI: 10.1145/2430456.2430466
Publication status: Published - 1 Jan 2013

Fingerprint

Cleaning
Metadata
Information management
Repair
Education
Health

ASJC Scopus subject areas

  • Software
  • Information Systems

Cite this

The data analytics group at the Qatar Computing Research Institute. / Beskales, George; Ilyas, Ihab F.; Papotti, Paolo; Das, Gautam; Naumann, Felix; Quiane Ruiz, Jorge Arnulfo; Elmagarmid, Ahmed; Ouzzani, Mourad; Tang, Nan.

In: SIGMOD Record, Vol. 41, No. 4, 01.01.2013, p. 33-38.

@article{d7bdb029c81e430aaa7a8f561f5d30c8,
title = "The data analytics group at the Qatar Computing Research Institute",
abstract = "The Qatar Computing Research Institute (QCRI), a member of Qatar Foundation for Education, Science and Community Development, started its activities in early 2011. QCRI is focusing on tackling large-scale computing challenges that address national priorities for growth and development and that have global impact in computing research. DA@QCRI has built expertise focusing on three core data management challenges: extracting data from its natural digital habitat, integrating a large and evolving number of sources, and robust cleaning to assure data quality and validation. Cleaning data requires collecting and maintaining a massive amount of metadata, such as data violations, lineage of data changes, and possible data repairs. In addition, users need to understand better the current health of the data and the data cleaning process through summarization or samples of data errors before they can effectively guide any data cleaning process. Providing a scalable data cleaning solution requires efficient methods to generate, maintain, and access such metadata.",
author = "George Beskales and Ilyas, {Ihab F.} and Paolo Papotti and Gautam Das and Felix Naumann and {Quiane Ruiz}, {Jorge Arnulfo} and Ahmed Elmagarmid and Mourad Ouzzani and Nan Tang",
year = "2013",
month = "1",
day = "1",
doi = "10.1145/2430456.2430466",
language = "English",
volume = "41",
pages = "33--38",
journal = "SIGMOD Record",
issn = "0163-5808",
publisher = "Association for Computing Machinery (ACM)",
number = "4",
}

TY - JOUR
T1 - The data analytics group at the Qatar Computing Research Institute
AU - Beskales, George
AU - Ilyas, Ihab F.
AU - Papotti, Paolo
AU - Das, Gautam
AU - Naumann, Felix
AU - Quiane Ruiz, Jorge Arnulfo
AU - Elmagarmid, Ahmed
AU - Ouzzani, Mourad
AU - Tang, Nan
PY - 2013/1/1
Y1 - 2013/1/1
N2 - The Qatar Computing Research Institute (QCRI), a member of Qatar Foundation for Education, Science and Community Development, started its activities in early 2011. QCRI is focusing on tackling large-scale computing challenges that address national priorities for growth and development and that have global impact in computing research. DA@QCRI has built expertise focusing on three core data management challenges: extracting data from its natural digital habitat, integrating a large and evolving number of sources, and robust cleaning to assure data quality and validation. Cleaning data requires collecting and maintaining a massive amount of metadata, such as data violations, lineage of data changes, and possible data repairs. In addition, users need to understand better the current health of the data and the data cleaning process through summarization or samples of data errors before they can effectively guide any data cleaning process. Providing a scalable data cleaning solution requires efficient methods to generate, maintain, and access such metadata.
UR - http://www.scopus.com/inward/record.url?scp=84872962050&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84872962050&partnerID=8YFLogxK
U2 - 10.1145/2430456.2430466
DO - 10.1145/2430456.2430466
M3 - Article
VL - 41
SP - 33
EP - 38
JO - SIGMOD Record
JF - SIGMOD Record
SN - 0163-5808
IS - 4
ER -