Nested mappings: Schema mapping reloaded

Ariel Fuxman, Mauricio A. Hernandez, Howard Ho, Renee J. Miller, Paolo Papotti, Lucian Popa

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

97 Citations (Scopus)

Abstract

Many problems in information integration rely on specifications, called schema mappings, that model the relationships between schemas. Schema mappings for both relational and nested data are well-known. In this work, we present a new formalism for schema mapping that extends these existing formalisms in two significant ways. First, our nested mappings allow for nesting and correlation of mappings. This results in a natural programming paradigm that often yields more accurate specifications. In particular, we show that nested mappings can naturally preserve correlations among data that existing mapping formalisms cannot. We also show that using nested mappings for purposes of exchanging data from a source to a target will result in less redundancy in the target data. The second extension to the mapping formalism is the ability to express, in a declarative way, grouping and data merging semantics. This semantics can be easily changed and customized to the integration task at hand. We present a new algorithm for the automatic generation of nested mappings from schema matchings (that is, simple element-to-element correspondences between schemas). We have implemented this algorithm, along with algorithms for the generation of transformation queries (e.g., XQuery) based on the nested mapping specification. We show that the generation algorithms scale well to large, highly nested schemas. We also show that using nested mappings in data exchange can drastically reduce the execution cost of producing a target instance, particularly over large data sources, and can also dramatically improve the quality of the generated data.
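
The redundancy claim in the abstract can be illustrated with a toy data exchange. The sketch below is not the paper's algorithm or notation: the department/employee schema, the function names `flat_exchange` and `nested_exchange`, and the deliberately naive reading of "flat" mappings (each mapping applied independently, with no merging) are all assumptions made for illustration.

```python
# Hypothetical source instance: departments with nested employees.
source = [
    {"dname": "R&D", "emps": [{"ename": "Ann"}, {"ename": "Bob"}]},
    {"dname": "Sales", "emps": [{"ename": "Cy"}]},
]


def flat_exchange(src):
    """Apply two independent, uncorrelated mappings (naive reading)."""
    target = []
    # m1: every source department yields a target department.
    for d in src:
        target.append({"dname": d["dname"], "emps": []})
    # m2: every (department, employee) pair yields its own target
    # department holding that single employee.
    for d in src:
        for e in d["emps"]:
            target.append({"dname": d["dname"],
                           "emps": [{"ename": e["ename"]}]})
    return target


def nested_exchange(src):
    """Apply one nested mapping: the employee submapping runs inside,
    and is correlated with, the enclosing department mapping."""
    target = []
    for d in src:
        dept = {"dname": d["dname"], "emps": []}
        for e in d["emps"]:
            # The inner mapping reuses the department created above.
            dept["emps"].append({"ename": e["ename"]})
        target.append(dept)
    return target


flat = flat_exchange(source)
nested = nested_exchange(source)
print(len(flat), len(nested))  # prints "5 2"
```

Under these assumptions the flat version emits five target departments for two source departments, while the nested version emits two and keeps each employee attached to its own department.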

Original language: English
Title of host publication: VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases
Pages: 67-78
Number of pages: 12
Publication status: Published - 1 Dec 2006
Externally published: Yes
Event: 32nd International Conference on Very Large Data Bases, VLDB 2006 - Seoul, Korea, Republic of
Duration: 12 Sep 2006 to 15 Sep 2006

Other

Other: 32nd International Conference on Very Large Data Bases, VLDB 2006
Country: Korea, Republic of
City: Seoul
Period: 12/9/06 to 15/9/06

Fingerprint

Specifications
Semantics
Electronic data interchange
Merging
Redundancy
Costs
Formalism
Execution costs
Data exchange
Grouping
Schema matching
Programming
Query
Information integration
Data sources
Paradigm

ASJC Scopus subject areas

  • Hardware and Architecture
  • Information Systems
  • Software
  • Information Systems and Management

Cite this

Fuxman, A., Hernandez, M. A., Ho, H., Miller, R. J., Papotti, P., & Popa, L. (2006). Nested mappings: Schema mapping reloaded. In VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases (pp. 67-78)

Nested mappings: Schema mapping reloaded. / Fuxman, Ariel; Hernandez, Mauricio A.; Ho, Howard; Miller, Renee J.; Papotti, Paolo; Popa, Lucian.

VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases. 2006. p. 67-78.

Fuxman, A, Hernandez, MA, Ho, H, Miller, RJ, Papotti, P & Popa, L 2006, Nested mappings: Schema mapping reloaded. in VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases. pp. 67-78, 32nd International Conference on Very Large Data Bases, VLDB 2006, Seoul, Korea, Republic of, 12/9/06.
Fuxman A, Hernandez MA, Ho H, Miller RJ, Papotti P, Popa L. Nested mappings: Schema mapping reloaded. In VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases. 2006. p. 67-78
Fuxman, Ariel; Hernandez, Mauricio A.; Ho, Howard; Miller, Renee J.; Papotti, Paolo; Popa, Lucian. / Nested mappings: Schema mapping reloaded. VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases. 2006. pp. 67-78
@inproceedings{741f44e513f24f65a0a2ed78bdccef8d,
title = "Nested mappings: Schema mapping reloaded",
abstract = "Many problems in information integration rely on specifications, called schema mappings, that model the relationships between schemas. Schema mappings for both relational and nested data are well-known. In this work, we present a new formalism for schema mapping that extends these existing formalisms in two significant ways. First, our nested mappings allow for nesting and correlation of mappings. This results in a natural programming paradigm that often yields more accurate specifications. In particular, we show that nested mappings can naturally preserve correlations among data that existing mapping formalisms cannot. We also show that using nested mappings for purposes of exchanging data from a source to a target will result in less redundancy in the target data. The second extension to the mapping formalism is the ability to express, in a declarative way, grouping and data merging semantics. This semantics can be easily changed and customized to the integration task at hand. We present a new algorithm for the automatic generation of nested mappings from schema matchings (that is, simple element-to-element correspondences between schemas). We have implemented this algorithm, along with algorithms for the generation of transformation queries (e.g., XQuery) based on the nested mapping specification. We show that the generation algorithms scale well to large, highly nested schemas. We also show that using nested mappings in data exchange can drastically reduce the execution cost of producing a target instance, particularly over large data sources, and can also dramatically improve the quality of the generated data.",
author = "Ariel Fuxman and Hernandez, {Mauricio A.} and Howard Ho and Miller, {Renee J.} and Paolo Papotti and Lucian Popa",
year = "2006",
month = "12",
day = "1",
language = "English",
isbn = "1595933859",
pages = "67--78",
booktitle = "VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases",

}

TY - GEN

T1 - Nested mappings

T2 - Schema mapping reloaded

AU - Fuxman, Ariel

AU - Hernandez, Mauricio A.

AU - Ho, Howard

AU - Miller, Renee J.

AU - Papotti, Paolo

AU - Popa, Lucian

PY - 2006/12/1

Y1 - 2006/12/1

N2 - Many problems in information integration rely on specifications, called schema mappings, that model the relationships between schemas. Schema mappings for both relational and nested data are well-known. In this work, we present a new formalism for schema mapping that extends these existing formalisms in two significant ways. First, our nested mappings allow for nesting and correlation of mappings. This results in a natural programming paradigm that often yields more accurate specifications. In particular, we show that nested mappings can naturally preserve correlations among data that existing mapping formalisms cannot. We also show that using nested mappings for purposes of exchanging data from a source to a target will result in less redundancy in the target data. The second extension to the mapping formalism is the ability to express, in a declarative way, grouping and data merging semantics. This semantics can be easily changed and customized to the integration task at hand. We present a new algorithm for the automatic generation of nested mappings from schema matchings (that is, simple element-to-element correspondences between schemas). We have implemented this algorithm, along with algorithms for the generation of transformation queries (e.g., XQuery) based on the nested mapping specification. We show that the generation algorithms scale well to large, highly nested schemas. We also show that using nested mappings in data exchange can drastically reduce the execution cost of producing a target instance, particularly over large data sources, and can also dramatically improve the quality of the generated data.

AB - Many problems in information integration rely on specifications, called schema mappings, that model the relationships between schemas. Schema mappings for both relational and nested data are well-known. In this work, we present a new formalism for schema mapping that extends these existing formalisms in two significant ways. First, our nested mappings allow for nesting and correlation of mappings. This results in a natural programming paradigm that often yields more accurate specifications. In particular, we show that nested mappings can naturally preserve correlations among data that existing mapping formalisms cannot. We also show that using nested mappings for purposes of exchanging data from a source to a target will result in less redundancy in the target data. The second extension to the mapping formalism is the ability to express, in a declarative way, grouping and data merging semantics. This semantics can be easily changed and customized to the integration task at hand. We present a new algorithm for the automatic generation of nested mappings from schema matchings (that is, simple element-to-element correspondences between schemas). We have implemented this algorithm, along with algorithms for the generation of transformation queries (e.g., XQuery) based on the nested mapping specification. We show that the generation algorithms scale well to large, highly nested schemas. We also show that using nested mappings in data exchange can drastically reduce the execution cost of producing a target instance, particularly over large data sources, and can also dramatically improve the quality of the generated data.

UR - http://www.scopus.com/inward/record.url?scp=84893842937&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84893842937&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84893842937

SN - 1595933859

SN - 9781595933850

SP - 67

EP - 78

BT - VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases

ER -