Web data reconciliation: Models and experiences

Lorenzo Blanco, Valter Crescenzi, Paolo Merialdo, Paolo Papotti

Research output: Chapter in Book/Report/Conference proceedingChapter

2 Citations (Scopus)

Abstract

An increasing number of web sites offer structured information about recognizable concepts, relevant to many application domains, such as finance, sport, commercial products. However, web data is inherently imprecise and uncertain, and conflicting values can be provided by different web sources. Characterizing the uncertainty of web data represents an important issue and several models have been recently proposed in the literature. This chapter illustrates state-of-the-art Bayesan models to evaluate the quality of data extracted from the Web and reports the results of an extensive application of the models on real life web data. Experimental results show that for some applications even simple approaches can provide effective results, while sophisticated solutions are needed to obtain a more precise characterization of the uncertainty.

Original languageEnglish
Title of host publicationLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages1-15
Number of pages15
Volume7538
DOIs
Publication statusPublished - 1 Dec 2012

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume7538
ISSN (Print)03029743
ISSN (Electronic)16113349

Fingerprint

Reconciliation
Finance
Sports
Websites
Model
Uncertainty
Experience
Evaluate
Experimental Results

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Blanco, L., Crescenzi, V., Merialdo, P., & Papotti, P. (2012). Web data reconciliation: Models and experiences. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7538, pp. 1-15). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7538). https://doi.org/10.1007/978-3-642-34213-4_1

Web data reconciliation : Models and experiences. / Blanco, Lorenzo; Crescenzi, Valter; Merialdo, Paolo; Papotti, Paolo.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 7538 2012. p. 1-15 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7538).

Research output: Chapter in Book/Report/Conference proceedingChapter

Blanco, L, Crescenzi, V, Merialdo, P & Papotti, P 2012, Web data reconciliation: Models and experiences. in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). vol. 7538, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 7538, pp. 1-15. https://doi.org/10.1007/978-3-642-34213-4_1
Blanco L, Crescenzi V, Merialdo P, Papotti P. Web data reconciliation: Models and experiences. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 7538. 2012. p. 1-15. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-642-34213-4_1
Blanco, Lorenzo ; Crescenzi, Valter ; Merialdo, Paolo ; Papotti, Paolo. / Web data reconciliation : Models and experiences. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 7538 2012. pp. 1-15 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inbook{ec7a59e7c9e04eaeab78864d7b9624c7,
title = "Web data reconciliation: Models and experiences",
abstract = "An increasing number of web sites offer structured information about recognizable concepts, relevant to many application domains, such as finance, sport, commercial products. However, web data is inherently imprecise and uncertain, and conflicting values can be provided by different web sources. Characterizing the uncertainty of web data represents an important issue and several models have been recently proposed in the literature. This chapter illustrates state-of-the-art Bayesan models to evaluate the quality of data extracted from the Web and reports the results of an extensive application of the models on real life web data. Experimental results show that for some applications even simple approaches can provide effective results, while sophisticated solutions are needed to obtain a more precise characterization of the uncertainty.",
author = "Lorenzo Blanco and Valter Crescenzi and Paolo Merialdo and Paolo Papotti",
year = "2012",
month = "12",
day = "1",
doi = "10.1007/978-3-642-34213-4_1",
language = "English",
isbn = "9783642342127",
volume = "7538",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "1--15",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - CHAP

T1 - Web data reconciliation

T2 - Models and experiences

AU - Blanco, Lorenzo

AU - Crescenzi, Valter

AU - Merialdo, Paolo

AU - Papotti, Paolo

PY - 2012/12/1

Y1 - 2012/12/1

N2 - An increasing number of web sites offer structured information about recognizable concepts, relevant to many application domains, such as finance, sport, commercial products. However, web data is inherently imprecise and uncertain, and conflicting values can be provided by different web sources. Characterizing the uncertainty of web data represents an important issue and several models have been recently proposed in the literature. This chapter illustrates state-of-the-art Bayesan models to evaluate the quality of data extracted from the Web and reports the results of an extensive application of the models on real life web data. Experimental results show that for some applications even simple approaches can provide effective results, while sophisticated solutions are needed to obtain a more precise characterization of the uncertainty.

AB - An increasing number of web sites offer structured information about recognizable concepts, relevant to many application domains, such as finance, sport, commercial products. However, web data is inherently imprecise and uncertain, and conflicting values can be provided by different web sources. Characterizing the uncertainty of web data represents an important issue and several models have been recently proposed in the literature. This chapter illustrates state-of-the-art Bayesan models to evaluate the quality of data extracted from the Web and reports the results of an extensive application of the models on real life web data. Experimental results show that for some applications even simple approaches can provide effective results, while sophisticated solutions are needed to obtain a more precise characterization of the uncertainty.

UR - http://www.scopus.com/inward/record.url?scp=84893705112&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84893705112&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-34213-4_1

DO - 10.1007/978-3-642-34213-4_1

M3 - Chapter

AN - SCOPUS:84893705112

SN - 9783642342127

VL - 7538

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 1

EP - 15

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -