Fast XML structural join algorithms by partitioning

Nan Tang, Jeffrey Xu Yu, Kam Fai Wong, Jianxin Li

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

An XML structural join evaluates structural relationships (e.g. parent-child or ancestor-descendant) between XML elements. It serves as an important computation unit in XML pattern matching. Several classical structural join algorithms have been proposed, such as Stack-tree join and XR-Tree join. In this paper, we address the structural join problem by partitioning. We use the Dietz numbering scheme for encoding, since nodes with Dietz encodings can be well distributed on a plane. We first extend the relationships between nodes to relationships between partitions on a plane, and derive observations and properties about the relationships between partitions. Based on these properties, we then propose a new partition-based method, named P-Join, for structural joins between ancestor and descendant nodes. Moreover, we present an enhanced partition-based structural join algorithm and two optimized methods. Extensive experiments show that our proposed algorithms outperform the Stack-tree and XR-Tree algorithms. To store the partitioning results, we design a simple but efficient index structure, called PSS-tree. Experimental results show that it has less maintenance overhead than XR-Tree.
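
As a minimal sketch of the encoding the abstract refers to: under the Dietz numbering scheme each node receives a (preorder, postorder) pair, and a node a is an ancestor of a node d exactly when pre(a) < pre(d) and post(a) > post(d). Plotting each node at the point (pre, post) is what places the nodes on a plane. The nested-tuple input format and the names `dietz_number`, `is_ancestor` and `naive_structural_join` are assumptions made for this illustration; the join below is a brute-force baseline, not the paper's P-Join algorithm.

```python
from collections import namedtuple

# A node encoded by its tag and its Dietz (preorder, postorder) ranks.
Node = namedtuple("Node", "tag pre post")


def dietz_number(tree):
    """Assign preorder and postorder ranks to every node.

    `tree` is a nested tuple (tag, [children...]); this input shape is
    an assumption made for the example, not something from the paper.
    """
    nodes = []
    counters = {"pre": 0, "post": 0}

    def visit(subtree):
        tag, children = subtree
        counters["pre"] += 1           # rank on entering the node
        pre = counters["pre"]
        for child in children:
            visit(child)
        counters["post"] += 1          # rank on leaving the node
        nodes.append(Node(tag, pre, counters["post"]))

    visit(tree)
    return sorted(nodes, key=lambda n: n.pre)


def is_ancestor(a, d):
    """Dietz test: a starts before d and finishes after d."""
    return a.pre < d.pre and a.post > d.post


def naive_structural_join(ancestors, descendants):
    """Brute-force ancestor-descendant join (quadratic baseline)."""
    return [(a, d) for a in ancestors for d in descendants if is_ancestor(a, d)]


# Hypothetical document: book/chapter/title with one nested section.
doc = ("book", [
    ("chapter", [("title", []), ("section", [("title", [])])]),
    ("chapter", [("title", [])]),
])

nodes = dietz_number(doc)
chapters = [n for n in nodes if n.tag == "chapter"]
titles = [n for n in nodes if n.tag == "title"]
pairs = naive_structural_join(chapters, titles)
```

Seen on the plane, the descendants of a node at (pre, post) are the points to its lower right (larger pre, smaller post). A partition-based join can plausibly exploit this by comparing whole regions of the plane instead of individual pairs.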

Original language: English
Pages (from-to): 33-53
Number of pages: 21
Journal: Journal of Research and Practice in Information Technology
Volume: 40
Issue number: 1
Publication status: Published - 31 Mar 2008
Externally published: Yes

Fingerprint

XML
Trees (mathematics)
Pattern matching
Partitioning
Join
Experiments
Node

Keywords

  • Partition
  • Structural join
  • XML

ASJC Scopus subject areas

  • Information Systems
  • Computer Graphics and Computer-Aided Design
  • Software

Cite this

Fast XML structural join algorithms by partitioning. / Tang, Nan; Yu, Jeffrey Xu; Wong, Kam Fai; Li, Jianxin.

In: Journal of Research and Practice in Information Technology, Vol. 40, No. 1, 31.03.2008, p. 33-53.

Tang, Nan ; Yu, Jeffrey Xu ; Wong, Kam Fai ; Li, Jianxin. / Fast XML structural join algorithms by partitioning. In: Journal of Research and Practice in Information Technology. 2008 ; Vol. 40, No. 1. pp. 33-53.
@article{4e480809a7404b4e85e0676207143f89,
title = "Fast XML structural join algorithms by partitioning",
keywords = "Partition, Structural join, XML",
author = "Tang, Nan and Yu, {Jeffrey Xu} and Wong, {Kam Fai} and Li, Jianxin",
year = "2008",
month = "3",
day = "31",
language = "English",
volume = "40",
pages = "33--53",
journal = "Journal of Research and Practice in Information Technology",
issn = "1443-458X",
publisher = "Australian Computer Society",
number = "1",

}

TY - JOUR

T1 - Fast XML structural join algorithms by partitioning

AU - Tang, Nan

AU - Yu, Jeffrey Xu

AU - Wong, Kam Fai

AU - Li, Jianxin

PY - 2008/3/31

Y1 - 2008/3/31

KW - Partition

KW - Structural join

KW - XML

UR - http://www.scopus.com/inward/record.url?scp=41149162851&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=41149162851&partnerID=8YFLogxK

M3 - Article

VL - 40

SP - 33

EP - 53

JO - Journal of Research and Practice in Information Technology

JF - Journal of Research and Practice in Information Technology

SN - 1443-458X

IS - 1

ER -