Entity identification in database integration

Ee Peng Lim, Jaideep Srivastava, Satya Prabhakar, James Richardson

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

89 Citations (Scopus)

Abstract

The objective of entity identification is to determine the correspondence between object instances from more than one database. This paper examines the problem at the instance level, assuming that schema-level heterogeneity has been resolved a priori. Soundness and completeness are defined as the desired properties of any entity identification technique. To achieve soundness, a set of identity and distinctness rules is established for entities in the integrated world. We propose the use of an extended key, which is the union of keys (and possibly other attributes) from the relations to be matched, together with its corresponding identity rule, to determine the equivalence between tuples from relations that may not share any common key. Instance-level functional dependencies (ILFDs), a form of semantic constraint information about real-world entities, are used to derive the missing extended key attribute values of a tuple.
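
The matching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the relations, attribute names, ILFD rules, and helper functions (derive, extended_key, match) below are all hypothetical, chosen only to show how an extended key and ILFDs might work together. Two relations with different keys, (name, cuisine) and (name, speciality), are assumed; the extended key is taken as the union of those keys, and ILFDs are modeled as simple "if these attribute values hold, then these values hold" rules.

```python
from typing import Optional

Row = dict[str, Optional[str]]
# An ILFD is modeled here (an assumption) as (antecedent values, consequent values).
ILFD = tuple[dict[str, str], dict[str, str]]

# Hypothetical extended key: union of the keys (name, cuisine) and (name, speciality).
EXTENDED_KEY = ("name", "cuisine", "speciality")

# Hypothetical relations; neither one stores the full extended key.
R: list[Row] = [
    {"name": "Twin Cities", "cuisine": "Chinese", "street": "Wash. Ave."},
    {"name": "Twin Cities", "cuisine": "Indian", "street": "Univ. Ave."},
]
S: list[Row] = [
    {"name": "Twin Cities", "speciality": "Hunan", "county": "Hennepin"},
    {"name": "Anjuman", "speciality": "Mughalai", "county": "Ramsey"},
]

# Hypothetical instance-level rules about the real-world entities.
ILFDS: list[ILFD] = [
    ({"speciality": "Hunan"}, {"cuisine": "Chinese"}),
    ({"name": "Twin Cities", "street": "Wash. Ave."}, {"speciality": "Hunan"}),
]


def derive(row: Row, ilfds: list[ILFD]) -> Row:
    """Fill in missing attribute values by applying ILFDs until nothing changes."""
    out = dict(row)
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in ilfds:
            if all(out.get(a) == v for a, v in antecedent.items()):
                for attr, value in consequent.items():
                    if out.get(attr) is None:
                        out[attr] = value
                        changed = True
    return out


def extended_key(row: Row) -> Optional[tuple[str, ...]]:
    """Return the extended key values, or None if any of them is still unknown."""
    values = tuple(row.get(attr) for attr in EXTENDED_KEY)
    return None if any(v is None for v in values) else values


def match(r: list[Row], s: list[Row], ilfds: list[ILFD]) -> list[tuple[int, int]]:
    """Pair up tuples whose complete extended keys are equal (the identity rule)."""
    index: dict[tuple[str, ...], list[int]] = {}
    for j, row in enumerate(s):
        key = extended_key(derive(row, ilfds))
        if key is not None:
            index.setdefault(key, []).append(j)
    pairs = []
    for i, row in enumerate(r):
        key = extended_key(derive(row, ilfds))
        if key is not None:
            pairs.extend((i, j) for j in index.get(key, []))
    return pairs


print(match(R, S, ILFDS))
```

In this sketch a tuple whose extended key cannot be completed is never declared identical to anything. That is a deliberately conservative choice in the spirit of the soundness property the abstract mentions; how the paper treats such tuples is not stated in the abstract.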

Original language: English
Title of host publication: 1993 IEEE 9th International Conference on Data Engineering
Publisher: Publ by IEEE
Pages: 294-301
Number of pages: 8
ISBN (Print): 0818635703
Publication status: Published - 1 Jan 1993
Event: 1993 IEEE 9th International Conference on Data Engineering - Vienna, Austria
Duration: 19 Apr 1993 to 23 Apr 1993

Publication series

Name: Proceedings - International Conference on Data Engineering

Other

Other: 1993 IEEE 9th International Conference on Data Engineering
City: Vienna, Austria
Period: 19/4/93 to 23/4/93

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems

Cite this

Lim, E. P., Srivastava, J., Prabhakar, S., & Richardson, J. (1993). Entity identification in database integration. In 1993 IEEE 9th International Conference on Data Engineering (pp. 294-301). (Proceedings - International Conference on Data Engineering). Publ by IEEE.