An algebraic transformation framework for multidatabase queries

Ee Peng Lim, Jaideep Srivastava, San Yih Hwang

Research output: Contribution to journal › Article

11 Citations (Scopus)

Abstract

The existence of semantic conflicts between component databases severely impacts query processing in a multidatabase system. In this paper, we describe two types of semantic conflicts that have to be dealt with in the integration of databases modeling information about related sets of real-world entities: the entity identification problem and the attribute value conflict problem. While the two-way outerjoin operation has been commonly used for resolving the entity identification problem between two component relations, outerjoins using regular equality comparisons between component relation keys are shown to produce counter-intuitive entity identification results. We remedy this by defining a new key-equality comparator that replaces the regular equality comparator in outerjoins. For the attribute value conflict problem, we define a Generalized Attribute Derivation (GAD) operation, which allows user-defined attribute derivation functions to be used to compute new attributes from the component relations' attributes. Once two-way outerjoin and GAD are added to the set of relational operations, the traditional algebraic transformation framework for relational queries is no longer adequate for multidatabase query processing and optimization. As a result, we introduce the constrained query tree as the multidatabase query representation. We show that some knowledge about query predicates and attribute derivation functions can be used to simplify queries. Such knowledge is modeled as an outerjoin graph attached to every outerjoin operation in the query tree. Based on this, we further extend the traditional algebraic transformation framework to include two-way outerjoins and GAD operations. Our framework demonstrates that properties of selection/join predicates and attribute derivation functions can be used to provide interesting transformation alternatives. This framework also serves as a formal basis for developing optimization strategies for multidatabase queries.
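As a rough illustration of the two integration operations named in the abstract, the following Python sketch pairs tuples from two component relations with a two-way outerjoin and then applies user-defined derivation functions in the spirit of GAD. It is only a sketch under stated assumptions: the abstract does not define the key-equality comparator, so regular key equality is used here, and the names `two_way_outerjoin`, `derive`, `prefer_left`, and `average`, as well as the sample relations, are hypothetical and not taken from the paper.

```python
from typing import Callable, Optional

Row = dict  # one tuple of a component relation, keyed by attribute name
Pair = tuple[Optional[Row], Optional[Row]]


def two_way_outerjoin(r: list[Row], s: list[Row], key: str) -> list[Pair]:
    """Pair rows of r and s on regular key equality (assumption: not the
    paper's key-equality comparator). Unmatched rows are paired with None."""
    s_by_key = {row[key]: row for row in s}
    matched = set()
    pairs: list[Pair] = []
    for row in r:
        partner = s_by_key.get(row[key])
        if partner is not None:
            matched.add(row[key])
        pairs.append((row, partner))
    # Rows of s with no partner in r are preserved as well.
    for row in s:
        if row[key] not in matched:
            pairs.append((None, row))
    return pairs


def derive(pairs: list[Pair], key: str,
           derivations: dict[str, Callable]) -> list[Row]:
    """GAD-style step: compute each integrated attribute from the two
    component values with a user-defined derivation function."""
    out = []
    for left, right in pairs:
        source = left if left is not None else right
        merged = {key: source[key]}
        for attr, fn in derivations.items():
            a = left.get(attr) if left is not None else None
            b = right.get(attr) if right is not None else None
            merged[attr] = fn(a, b)
        out.append(merged)
    return out


def prefer_left(a, b):
    """Hypothetical derivation function: trust the first relation when it has a value."""
    return a if a is not None else b


def average(a, b):
    """Hypothetical derivation function: average whichever values are present."""
    values = [v for v in (a, b) if v is not None]
    return sum(values) / len(values)


# Hypothetical component relations with a conflicting name and salary for id 2.
r = [{"id": 1, "name": "Ann", "salary": 100}, {"id": 2, "name": "Bob", "salary": 80}]
s = [{"id": 2, "name": "Robert", "salary": 90}, {"id": 3, "name": "Cy", "salary": 70}]

integrated = derive(two_way_outerjoin(r, s, "id"), "id",
                    {"name": prefer_left, "salary": average})
```

Here the outerjoin keeps entities that appear in only one relation, while the derivation functions decide how conflicting attribute values for the same entity are reconciled.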

Original language: English
Pages (from-to): 273-307
Number of pages: 35
Journal: Distributed and Parallel Databases
Volume: 3
Issue number: 3
Publication status: Published - 1 Jul 1995

Keywords

  • algebraic transformation
  • constrained query tree
  • integration operation
  • multidatabase query
  • outerjoin graph

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Hardware and Architecture
  • Information Systems and Management
