Fast structural join with a location function

Nan Tang, Jeffrey Xu Yu, Kam Fai Wong, Haifeng Jiang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

A structural join evaluates a structural relationship (parent-child or ancestor-descendant) between XML elements. It serves as an important computation unit in XML pattern matching, such as twig joins. There is much existing work on efficient structural joins. In particular, indexes can expedite structural joins by skipping unmatchable elements. A typical use of an index is to retrieve, for a given element, all its ancestor (or descendant) elements from an indexed set. However, we observed two possible limitations with such index probes, namely false hit and false locate. A false hit means that an index probe touches unnecessary data in addition to the real results; a false locate is a wasted probe that returns zero answers. Both false hits and false locates reduce the efficiency of structural joins. In this paper, we set out to develop a new structural join algorithm with no false hits and no false locates. We show that the R-Tree has the no-false-hit property (in contrast to the B+-Tree) and hence is a good candidate for our algorithm. To avoid false locates, we propose a new function called Location, which identifies the probing points that will result in matches. We design and implement the Location function using a space-efficient structure, and present our algorithm using the R-Tree together with the Location function. Extensive experiments show the efficiency of our algorithm.
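To make the terms in the abstract concrete, the following is a minimal sketch of an ancestor-descendant and parent-child test followed by a naive join. It assumes the common region encoding of XML elements as (start, end, level) triples, which the abstract does not specify. The names `Element`, `is_ancestor`, `is_parent`, and `naive_structural_join` are hypothetical. The nested-loop scan is not the paper's algorithm and uses neither an R-Tree nor the Location function; it only illustrates what a probe with zero answers (a false locate) looks like.

```python
from typing import List, NamedTuple, Tuple


class Element(NamedTuple):
    """An XML element under an assumed region encoding (start, end, level)."""
    tag: str
    start: int   # position of the opening tag in document order
    end: int     # position of the closing tag in document order
    level: int   # depth in the document tree (root = 0)


def is_ancestor(a: Element, d: Element) -> bool:
    # a is an ancestor of d when a's region strictly contains d's region.
    return a.start < d.start and d.end < a.end


def is_parent(a: Element, d: Element) -> bool:
    # A parent is an ancestor exactly one level above.
    return is_ancestor(a, d) and a.level + 1 == d.level


def naive_structural_join(
    ancestors: List[Element],
    descendants: List[Element],
    parent_child: bool = False,
) -> Tuple[List[Tuple[Element, Element]], int]:
    """Scan all ancestors for each descendant (one 'probe' per descendant).

    Returns the matching pairs and the number of probes that found
    nothing, i.e. the probes the abstract calls false locates.
    """
    test = is_parent if parent_child else is_ancestor
    pairs: List[Tuple[Element, Element]] = []
    wasted_probes = 0
    for d in descendants:
        matches = [a for a in ancestors if test(a, d)]
        if not matches:
            wasted_probes += 1
        pairs.extend((a, d) for a in matches)
    return pairs, wasted_probes


# Hypothetical document: <r><a><b><c/></b><c/></a><c/></r>
a_list = [Element("a", 2, 9, 1)]
c_list = [
    Element("c", 4, 5, 3),    # inside <b>, which is inside <a>
    Element("c", 7, 8, 2),    # direct child of <a>
    Element("c", 10, 11, 1),  # sibling of <a>, outside its region
]

ad_pairs, ad_wasted = naive_structural_join(a_list, c_list)
pc_pairs, pc_wasted = naive_structural_join(a_list, c_list, parent_child=True)
print(len(ad_pairs), ad_wasted)  # 2 1
print(len(pc_pairs), pc_wasted)  # 1 2
```

In this sketch every descendant triggers a probe, so the third `c` element costs a full scan that yields nothing. A function that tells the join in advance which probing points will produce matches, as the abstract describes for Location, would let such probes be skipped.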

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 777-786
Number of pages: 10
Volume: 3882 LNCS
DOI: https://doi.org/10.1007/11733836_55
Publication status: Published - 7 Jul 2006
Externally published: Yes
Event: 11th International Conference on Database Systems for Advanced Applications, DASFAA 2006 - Singapore, Singapore
Duration: 12 Apr 2006 to 15 Apr 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3882 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 11th International Conference on Database Systems for Advanced Applications, DASFAA 2006
Country: Singapore
City: Singapore
Period: 12/4/06 to 15/4/06

ASJC Scopus subject areas

  • Computer Science (all)
  • Biochemistry, Genetics and Molecular Biology (all)
  • Theoretical Computer Science

Cite this

Tang, N., Yu, J. X., Wong, K. F., & Jiang, H. (2006). Fast structural join with a location function. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3882 LNCS, pp. 777-786). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 3882 LNCS). https://doi.org/10.1007/11733836_55

@inproceedings{8aa3433e77ab46d6ad51be488515f2c1,
title = "Fast structural join with a location function",
abstract = "A structural join evaluates a structural relationship (parent-child or ancestor-descendant) between XML elements. It serves as an important computation unit in XML pattern matching, such as twig joins. There is much existing work on efficient structural joins. In particular, indexes can expedite structural joins by skipping unmatchable elements. A typical use of an index is to retrieve, for a given element, all its ancestor (or descendant) elements from an indexed set. However, we observed two possible limitations with such index probes, namely false hit and false locate. A false hit means that an index probe touches unnecessary data in addition to the real results; a false locate is a wasted probe that returns zero answers. Both false hits and false locates reduce the efficiency of structural joins. In this paper, we set out to develop a new structural join algorithm with no false hits and no false locates. We show that the R-Tree has the no-false-hit property (in contrast to the B+-Tree) and hence is a good candidate for our algorithm. To avoid false locates, we propose a new function called Location, which identifies the probing points that will result in matches. We design and implement the Location function using a space-efficient structure, and present our algorithm using the R-Tree together with the Location function. Extensive experiments show the efficiency of our algorithm.",
author = "Nan Tang and Yu, {Jeffrey Xu} and Wong, {Kam Fai} and Haifeng Jiang",
year = "2006",
month = "7",
day = "7",
doi = "10.1007/11733836_55",
language = "English",
isbn = "3540333371",
volume = "3882 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "777--786",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - Fast structural join with a location function

AU - Tang, Nan

AU - Yu, Jeffrey Xu

AU - Wong, Kam Fai

AU - Jiang, Haifeng

PY - 2006/7/7

Y1 - 2006/7/7

N2 - A structural join evaluates a structural relationship (parent-child or ancestor-descendant) between XML elements. It serves as an important computation unit in XML pattern matching, such as twig joins. There is much existing work on efficient structural joins. In particular, indexes can expedite structural joins by skipping unmatchable elements. A typical use of an index is to retrieve, for a given element, all its ancestor (or descendant) elements from an indexed set. However, we observed two possible limitations with such index probes, namely false hit and false locate. A false hit means that an index probe touches unnecessary data in addition to the real results; a false locate is a wasted probe that returns zero answers. Both false hits and false locates reduce the efficiency of structural joins. In this paper, we set out to develop a new structural join algorithm with no false hits and no false locates. We show that the R-Tree has the no-false-hit property (in contrast to the B+-Tree) and hence is a good candidate for our algorithm. To avoid false locates, we propose a new function called Location, which identifies the probing points that will result in matches. We design and implement the Location function using a space-efficient structure, and present our algorithm using the R-Tree together with the Location function. Extensive experiments show the efficiency of our algorithm.

UR - http://www.scopus.com/inward/record.url?scp=33745552502&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33745552502&partnerID=8YFLogxK

U2 - 10.1007/11733836_55

DO - 10.1007/11733836_55

M3 - Conference contribution

SN - 3540333371

SN - 9783540333371

VL - 3882 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 777

EP - 786

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -