FIX: Feature-based indexing technique for XML documents

Ning Zhang, M. Tamer Özsu, Ihab F. Ilyas, Ashraf Aboulnaga

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

26 Citations (Scopus)

Abstract

Indexing large XML databases is crucial for efficient evaluation of XML twig queries. In this paper, we propose a feature-based indexing technique, called FIX, based on spectral graph theory. The basic idea is that for each twig pattern in a collection of XML documents, we calculate a vector of features based on its structural properties. These features are used as keys for the patterns and stored in a B+ tree. Given an XPath query, its feature vector is first calculated and looked up in the index. Then a further refinement phase is performed to fetch the final results. We experimentally study the indexing technique over both synthetic and real data sets. Our experiments show that FIX provides great pruning power and could gain an order of magnitude performance improvement for many XPath queries over existing evaluation techniques.
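The filter-and-refine pipeline described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the abstract does not say which spectral features FIX computes, so the sketch swaps in two simple structural features (tree depth and number of distinct labels), a sorted list searched with `bisect` stands in for the B+ tree, and only child-axis twig patterns are handled. All function names (`features`, `build_index`, `lookup`, `embeds`) are hypothetical.

```python
import bisect

# A pattern is a (label, [children]) tuple. Hypothetical representation.

def depth(tree):
    label, children = tree
    return 1 + max((depth(c) for c in children), default=0)

def labels(tree):
    label, children = tree
    found = {label}
    for child in children:
        found |= labels(child)
    return found

def features(tree):
    # Stand-in features (NOT the spectral features used by FIX).
    # Both are monotone: a child-axis query that matches at a pattern's
    # root can have neither greater depth nor more distinct labels.
    return (depth(tree), len(labels(tree)))

def subtrees(tree):
    yield tree
    for child in tree[1]:
        yield from subtrees(child)

def build_index(docs):
    # Index the subtree rooted at every node, keyed by its feature vector.
    entries = [(features(sub), doc_id, sub)
               for doc_id, doc in enumerate(docs)
               for sub in subtrees(doc)]
    entries.sort(key=lambda e: e[0])
    keys = [e[0] for e in entries]
    return keys, entries

def embeds(query, data):
    # Refinement: does the query match with its root mapped to data's root?
    if query[0] != data[0]:
        return False
    return all(any(embeds(qc, dc) for dc in data[1]) for qc in query[1])

def lookup(index, query):
    keys, entries = index
    q_depth, q_labels = features(query)
    # Pruning: skip every key whose depth is smaller than the query's,
    # then drop entries with too few distinct labels.
    start = bisect.bisect_left(keys, (q_depth, 0))
    candidates = [e for e in entries[start:] if e[0][1] >= q_labels]
    # Refinement: run the exact match only on the surviving candidates.
    matches = sorted({doc_id for _, doc_id, sub in candidates
                      if embeds(query, sub)})
    return candidates, matches

docs = [
    ("a", [("b", [("c", [])]), ("d", [])]),
    ("a", [("b", [])]),
    ("x", [("a", [("b", [("c", [])])])]),
]
index = build_index(docs)
twig = ("a", [("b", [("c", [])])])   # roughly the XPath //a[b/c]
candidates, matches = lookup(index, twig)
print(len(index[1]), len(candidates), matches)
```

In this toy collection the feature lookup leaves 3 of the 10 indexed patterns for the refinement step, which then reports documents 0 and 2. The pruning is safe only because the chosen features never exclude a true match; any replacement feature would need the same property.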

Original language: English
Title of host publication: VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases
Pages: 259-270
Number of pages: 12
Publication status: Published - 1 Dec 2006
Externally published: Yes
Event: 32nd International Conference on Very Large Data Bases, VLDB 2006 - Seoul, Korea, Republic of
Duration: 12 Sep 2006 to 15 Sep 2006

Fingerprint

XML
Graph theory
Structural properties
Indexing
Query
Experiments
Evaluation
XPath
Performance improvement
Pruning
Database

ASJC Scopus subject areas

  • Hardware and Architecture
  • Information Systems
  • Software
  • Information Systems and Management

Cite this

Zhang, N., Özsu, M. T., Ilyas, I. F., & Aboulnaga, A. (2006). FIX: Feature-based indexing technique for XML documents. In VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases (pp. 259-270).

@inproceedings{caf80a66171440d892f7be95c38caac4,
title = "FIX: Feature-based indexing technique for XML documents",
author = "Ning Zhang and {\"O}zsu, {M. Tamer} and Ilyas, {Ihab F.} and Ashraf Aboulnaga",
year = "2006",
month = "12",
day = "1",
language = "English",
isbn = "1595933859",
pages = "259--270",
booktitle = "VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases",

}

TY - GEN

T1 - FIX

T2 - Feature-based indexing technique for XML documents

AU - Zhang, Ning

AU - Özsu, M. Tamer

AU - Ilyas, Ihab F.

AU - Aboulnaga, Ashraf

PY - 2006/12/1

Y1 - 2006/12/1

UR - http://www.scopus.com/inward/record.url?scp=35448931050&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=35448931050&partnerID=8YFLogxK

M3 - Conference contribution

SN - 1595933859

SN - 9781595933850

SP - 259

EP - 270

BT - VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases

ER -