Sailing the information ocean with awareness of currents

Discovery and application of source dependence

Laure Berti-Equille, Anish Das Sarma, Xin Luna Dong, Amélie Marian, Divesh Srivastava

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

43 Citations (Scopus)

Abstract

The Web has enabled the availability of a huge amount of useful information, but has also eased the ability to spread false information and rumors across multiple sources, making it hard to distinguish between what is true and what is not. Recent examples include the premature Steve Jobs obituary, the second bankruptcy of United Airlines, the creation of black holes by the operation of the Large Hadron Collider, etc. Since it is important to permit the expression of dissenting and conflicting opinions, it would be a fallacy to try to ensure that the Web provides only consistent information. However, to help in separating the wheat from the chaff, it is essential to be able to determine dependence between sources. Given the huge number of data sources and the vast volume of conflicting data available on the Web, doing so in a scalable manner is extremely challenging and has not been addressed by existing work yet. In this paper, we present a set of research problems and propose some preliminary solutions on the issues involved in discovering dependence between sources. We also discuss how this knowledge can benefit a variety of technologies, such as data integration and Web 2.0, that help users manage and access the totality of the available information from various sources.

Original language: English
Title of host publication: CIDR 2009 - 4th Biennial Conference on Innovative Data Systems Research
Publication status: Published - 1 Dec 2009
Externally published: Yes
Event: 4th Biennial Conference on Innovative Data Systems Research, CIDR 2009 - Asilomar, CA, United States
Duration: 4 Jan 2009 to 7 Jan 2009

Other

Other: 4th Biennial Conference on Innovative Data Systems Research, CIDR 2009
Country: United States
City: Asilomar, CA
Period: 4/1/09 to 7/1/09


ASJC Scopus subject areas

  • Information Systems

Cite this

Berti-Equille, L., Sarma, A. D., Dong, X. L., Marian, A., & Srivastava, D. (2009). Sailing the information ocean with awareness of currents: Discovery and application of source dependence. In CIDR 2009 - 4th Biennial Conference on Innovative Data Systems Research.

@inproceedings{cba9b715147c468cb61d2e37132d8968,
title = "Sailing the information ocean with awareness of currents: Discovery and application of source dependence",
author = "Laure Berti-Equille and Sarma, {Anish Das} and Dong, {Xin Luna} and Amélie Marian and Divesh Srivastava",
year = "2009",
month = "12",
day = "1",
language = "English",
booktitle = "CIDR 2009 - 4th Biennial Conference on Innovative Data Systems Research",

}

TY - GEN

T1 - Sailing the information ocean with awareness of currents

T2 - Discovery and application of source dependence

AU - Berti-Equille, Laure

AU - Sarma, Anish Das

AU - Dong, Xin Luna

AU - Marian, Amélie

AU - Srivastava, Divesh

PY - 2009/12/1

Y1 - 2009/12/1

UR - http://www.scopus.com/inward/record.url?scp=84858684903&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84858684903&partnerID=8YFLogxK

M3 - Conference contribution

BT - CIDR 2009 - 4th Biennial Conference on Innovative Data Systems Research

ER -