Usage-based schema matching

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

36 Citations (Scopus)

Abstract

Existing techniques for schema matching are classified as either schema-based, instance-based, or a combination of both. In this paper, we define a new class of techniques, called usage-based schema matching. The idea is to exploit information extracted from the query logs to find correspondences between attributes in the schemas to be matched. We propose methods to identify co-occurrence patterns between attributes in addition to other features such as their use in joins and with aggregate functions. Several scoring functions are considered to measure the similarity of the extracted features, and a genetic algorithm is employed to find the highest-score mappings between the two schemas. Our technique is suitable for matching schemas even when their attribute names are opaque. It can further be combined with existing techniques to obtain more accurate results. Our experimental study demonstrates the effectiveness of the proposed approach and the benefit of combining it with other existing approaches.
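The co-occurrence idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the toy query logs, the token-based attribute detection, and the scoring function (negative sum of absolute differences in pairwise co-occurrence counts) are all assumptions made for illustration. An exhaustive search over permutations stands in for the genetic algorithm, which is only practical for very small schemas, and the sketch assumes the second schema has at least as many attributes as the first.

```python
import re
from collections import Counter
from itertools import combinations, permutations


def cooccurrence_counts(query_log, attributes):
    """Count, for each attribute pair, the number of queries that mention both."""
    counts = Counter()
    for query in query_log:
        # Crude tokenization (assumption): identifiers are matched by name only.
        tokens = set(re.findall(r"[a-z_][a-z_0-9]*", query.lower()))
        present = sorted(a for a in attributes if a.lower() in tokens)
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts


def mapping_score(counts_a, counts_b, mapping):
    """Hypothetical score: 0 is a perfect match, more negative is worse."""
    total = 0
    for a1, a2 in combinations(sorted(mapping), 2):
        b_pair = tuple(sorted((mapping[a1], mapping[a2])))
        total += abs(counts_a[(a1, a2)] - counts_b[b_pair])
    return -total


def best_mapping(counts_a, counts_b, attrs_a, attrs_b):
    """Exhaustive search over one-to-one mappings (stand-in for a genetic algorithm)."""
    candidates = (
        dict(zip(attrs_a, perm)) for perm in permutations(attrs_b, len(attrs_a))
    )
    return max(candidates, key=lambda m: mapping_score(counts_a, counts_b, m))


# Toy query logs (made up). Schema B uses opaque attribute names.
log_a = [
    "SELECT name, salary FROM emp WHERE dept = 3",
    "SELECT name, salary FROM emp",
    "SELECT name FROM emp WHERE dept = 1",
    "SELECT dept, name FROM emp",
]
log_b = [
    "SELECT c1, c2 FROM t WHERE c3 > 0",
    "SELECT c1, c2 FROM t",
    "SELECT c1 FROM t WHERE c3 = 7",
    "SELECT c3, c1 FROM t",
]
attrs_a = ["name", "salary", "dept"]
attrs_b = ["c1", "c2", "c3"]

counts_a = cooccurrence_counts(log_a, attrs_a)
counts_b = cooccurrence_counts(log_b, attrs_b)
print(best_mapping(counts_a, counts_b, attrs_a, attrs_b))
```

In this toy case the opaque names c1, c2, c3 are matched purely from how the attributes are used together, which is the scenario the abstract highlights. Raw counts are compared directly here, so the sketch also assumes the two logs are of comparable size; a real scoring function would likely normalize them.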

Original language: English
Title of host publication: Proceedings - International Conference on Data Engineering
Pages: 20-29
Number of pages: 10
DOIs: https://doi.org/10.1109/ICDE.2008.4497410
Publication status: Published - 1 Oct 2008
Externally published: Yes
Event: 2008 IEEE 24th International Conference on Data Engineering, ICDE'08 - Cancun, Mexico
Duration: 7 Apr 2008 to 12 Apr 2008

Other

Other: 2008 IEEE 24th International Conference on Data Engineering, ICDE'08
Country: Mexico
City: Cancun
Period: 7/4/08 to 12/4/08

Fingerprint

Genetic algorithms

ASJC Scopus subject areas

  • Information Systems
  • Signal Processing
  • Software

Cite this

Elmeleegy, H., Ouzzani, M., & Elmagarmid, A. (2008). Usage-based schema matching. In Proceedings - International Conference on Data Engineering (pp. 20-29). [4497410] https://doi.org/10.1109/ICDE.2008.4497410

Usage-based schema matching. / Elmeleegy, Hazem; Ouzzani, Mourad; Elmagarmid, Ahmed.

Proceedings - International Conference on Data Engineering. 2008. p. 20-29 4497410.

Elmeleegy, H, Ouzzani, M & Elmagarmid, A 2008, Usage-based schema matching. in Proceedings - International Conference on Data Engineering., 4497410, pp. 20-29, 2008 IEEE 24th International Conference on Data Engineering, ICDE'08, Cancun, Mexico, 7/4/08. https://doi.org/10.1109/ICDE.2008.4497410
Elmeleegy H, Ouzzani M, Elmagarmid A. Usage-based schema matching. In Proceedings - International Conference on Data Engineering. 2008. p. 20-29. 4497410 https://doi.org/10.1109/ICDE.2008.4497410
Elmeleegy, Hazem ; Ouzzani, Mourad ; Elmagarmid, Ahmed. / Usage-based schema matching. Proceedings - International Conference on Data Engineering. 2008. pp. 20-29
@inproceedings{d385d73f80124fcf8337203ad6842ca9,
title = "Usage-based schema matching",
abstract = "Existing techniques for schema matching are classified as either schema-based, instance-based, or a combination of both. In this paper, we define a new class of techniques, called usage-based schema matching. The idea is to exploit information extracted from the query logs to find correspondences between attributes in the schemas to be matched. We propose methods to identify co-occurrence patterns between attributes in addition to other features such as their use in joins and with aggregate functions. Several scoring functions are considered to measure the similarity of the extracted features, and a genetic algorithm is employed to find the highest-score mappings between the two schemas. Our technique is suitable for matching schemas even when their attribute names are opaque. It can further be combined with existing techniques to obtain more accurate results. Our experimental study demonstrates the effectiveness of the proposed approach and the benefit of combining it with other existing approaches.",
author = "Hazem Elmeleegy and Mourad Ouzzani and Ahmed Elmagarmid",
year = "2008",
month = "10",
day = "1",
doi = "10.1109/ICDE.2008.4497410",
language = "English",
isbn = "9781424418374",
pages = "20--29",
booktitle = "Proceedings - International Conference on Data Engineering",

}

TY - GEN
T1 - Usage-based schema matching
AU - Elmeleegy, Hazem
AU - Ouzzani, Mourad
AU - Elmagarmid, Ahmed
PY - 2008/10/1
Y1 - 2008/10/1
AB - Existing techniques for schema matching are classified as either schema-based, instance-based, or a combination of both. In this paper, we define a new class of techniques, called usage-based schema matching. The idea is to exploit information extracted from the query logs to find correspondences between attributes in the schemas to be matched. We propose methods to identify co-occurrence patterns between attributes in addition to other features such as their use in joins and with aggregate functions. Several scoring functions are considered to measure the similarity of the extracted features, and a genetic algorithm is employed to find the highest-score mappings between the two schemas. Our technique is suitable for matching schemas even when their attribute names are opaque. It can further be combined with existing techniques to obtain more accurate results. Our experimental study demonstrates the effectiveness of the proposed approach and the benefit of combining it with other existing approaches.
UR - http://www.scopus.com/inward/record.url?scp=52649088777&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=52649088777&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2008.4497410
DO - 10.1109/ICDE.2008.4497410
M3 - Conference contribution
AN - SCOPUS:52649088777
SN - 9781424418374
SP - 20
EP - 29
BT - Proceedings - International Conference on Data Engineering
ER -