Entity identification in database integration

Ee Peng Lim, Jaideep Srivastava, Satya Prabhakar, James Richardson

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

87 Citations (Scopus)

Abstract

The objective of entity identification is to determine the correspondence between object instances from more than one database. This paper examines the problem at the instance level, assuming that schema-level heterogeneity has been resolved a priori. Soundness and completeness are defined as the desired properties of any entity identification technique. To achieve soundness, a set of identity and distinctness rules is established for entities in the integrated world. We propose the use of an extended key, which is the union of keys (and possibly other attributes) from the relations to be matched, together with its corresponding identity rule, to determine the equivalence between tuples from relations that may not share any common key. Instance-level functional dependencies (ILFDs), a form of semantic constraint information about the real-world entities, are used to derive the missing extended key attribute values of a tuple.
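The extended-key idea described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm, only one possible reading of the abstract: the relation contents, the attribute names, the `ILFD` tuple encoding, and the function names `apply_ilfds` and `match_by_extended_key` are all hypothetical. It assumes relation R has key (name, cuisine), relation S has key (name, street), and the extended key is their union.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Row = Dict[str, Optional[str]]
# Hypothetical ILFD encoding: (antecedent attribute values, target attribute, implied value).
ILFD = Tuple[Dict[str, str], str, str]


def apply_ilfds(row: Row, ilfds: Sequence[ILFD]) -> Row:
    """Return a copy of `row` with missing attribute values derived from ILFDs."""
    derived = dict(row)
    changed = True
    while changed:  # repeat so that one derived value can trigger another rule
        changed = False
        for antecedent, attr, value in ilfds:
            holds = all(derived.get(k) == v for k, v in antecedent.items())
            if holds and derived.get(attr) is None:
                derived[attr] = value
                changed = True
    return derived


def match_by_extended_key(
    r: List[Row], s: List[Row], extended_key: Sequence[str], ilfds: Sequence[ILFD]
) -> List[Tuple[int, int]]:
    """Pair tuples that agree on every extended-key attribute.

    A tuple whose extended-key value is still unknown after applying the
    ILFDs is never matched, so no identity is asserted without evidence.
    """
    r_full = [apply_ilfds(t, ilfds) for t in r]
    s_full = [apply_ilfds(t, ilfds) for t in s]
    matches = []
    for i, a in enumerate(r_full):
        for j, b in enumerate(s_full):
            if all(a.get(k) is not None and a.get(k) == b.get(k) for k in extended_key):
                matches.append((i, j))
    return matches


# Illustrative data only: R is keyed on (name, cuisine), S on (name, street).
restaurants_r = [
    {"name": "Golden Wok", "cuisine": "chinese", "street": "Main St"},
    {"name": "Golden Wok", "cuisine": "thai", "street": "Main St"},
]
restaurants_s = [
    {"name": "Golden Wok", "street": "Main St", "speciality": "hunan"},
    {"name": "Roma", "street": "Oak Ave", "speciality": "gelato"},
]
# ILFD: a restaurant whose speciality is "hunan" serves "chinese" cuisine.
ilfds = [({"speciality": "hunan"}, "cuisine", "chinese")]
extended_key = ("name", "cuisine", "street")

matches = match_by_extended_key(restaurants_r, restaurants_s, extended_key, ilfds)
print(matches)  # [(0, 0)]
```

In this sketch the S tuple for "Golden Wok" has no cuisine value until the ILFD supplies one, after which it agrees with the first R tuple on the whole extended key. The "Roma" tuple stays unmatched because its cuisine cannot be derived.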

Original language: English
Title of host publication: Proceedings - International Conference on Data Engineering
Place of publication: Los Alamitos, CA, United States
Publisher: IEEE
Pages: 294-301
Number of pages: 8
ISBN (Print): 0818635703
Publication status: Published - 1993
Externally published: Yes
Event: 1993 IEEE 9th International Conference on Data Engineering - Vienna, Austria
Duration: 19 Apr 1993 – 23 Apr 1993

Fingerprint

Semantics

ASJC Scopus subject areas

  • Software
  • Engineering (all)
  • Engineering (miscellaneous)

Cite this

Lim, E. P., Srivastava, J., Prabhakar, S., & Richardson, J. (1993). Entity identification in database integration. In Proceedings - International Conference on Data Engineering (pp. 294-301). Los Alamitos, CA, United States: IEEE.

@inproceedings{d5d96c3a8dd348d0a5139f307e74b8c4,
title = "Entity identification in database integration",
author = "Lim, {Ee Peng} and Jaideep Srivastava and Satya Prabhakar and James Richardson",
year = "1993",
language = "English",
isbn = "0818635703",
pages = "294--301",
booktitle = "Proceedings - International Conference on Data Engineering",
publisher = "IEEE",
}

TY - GEN
T1 - Entity identification in database integration
AU - Lim, Ee Peng
AU - Srivastava, Jaideep
AU - Prabhakar, Satya
AU - Richardson, James
PY - 1993
Y1 - 1993
UR - http://www.scopus.com/inward/record.url?scp=0027189241&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0027189241&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:0027189241
SN - 0818635703
SP - 294
EP - 301
BT - Proceedings - International Conference on Data Engineering
PB - IEEE
CY - Los Alamitos, CA, United States
ER -