Building XML statistics for the hidden web

Ashraf Aboulnaga, Jeffrey F. Naughton

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

5 Citations (Scopus)

Abstract

There have been several techniques proposed for building statistics for static XML data. However, very little work has been done in the area of building XML statistics for data sources that export XML views of data that is stored in relational or other databases. For such data sources, we need statistics that are built in an on-line manner, by observing the XML queries to the data sources and their results. In this paper, we present a technique for building on-line XML statistics by observing the XPath queries issued to a data source and their result sizes. These XPath queries select parts of the virtual XML document representing the XML view of the data at the data source. We convert these XPath queries to a more abstract and generalized form that we call annotated path expressions. We present a technique for storing these annotated path expressions and information about their selectivity for use in estimating the selectivity of future XPath queries. We also present an experimental evaluation of our proposed approach.
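As a rough illustration of the on-line idea described in the abstract, the following Python sketch observes XPath queries with their result sizes, generalizes each query, and reuses the stored sizes to estimate later queries. It is not the authors' method: the abstract does not define annotated path expressions, so the `generalize` function (which merely replaces literal values inside predicates with `?`), the `OnlineXPathStats` class, and the running-average estimate are all hypothetical stand-ins chosen for brevity.

```python
import re

# Assumption: "generalizing" a query here only means replacing quoted strings
# and numbers inside [...] predicates with "?". The paper's annotated path
# expressions are not specified in the abstract and may differ substantially.
_LITERAL = re.compile(r"""("[^"]*"|'[^']*'|\b\d+(?:\.\d+)?\b)""")
_PREDICATE = re.compile(r"\[([^\]]*)\]")


def generalize(xpath: str) -> str:
    """Return the query with predicate literals replaced by '?'."""
    return _PREDICATE.sub(
        lambda m: "[" + _LITERAL.sub("?", m.group(1)) + "]", xpath
    )


class OnlineXPathStats:
    """Hypothetical store of observed result sizes per generalized query."""

    def __init__(self):
        self._total = {}  # generalized query -> sum of observed result sizes
        self._count = {}  # generalized query -> number of observations

    def observe(self, xpath: str, result_size: int) -> None:
        """Record one query issued to the data source and its result size."""
        key = generalize(xpath)
        self._total[key] = self._total.get(key, 0) + result_size
        self._count[key] = self._count.get(key, 0) + 1

    def estimate(self, xpath: str):
        """Average observed size for queries of the same shape, or None."""
        key = generalize(xpath)
        if key not in self._count:
            return None
        return self._total[key] / self._count[key]


stats = OnlineXPathStats()
stats.observe("/catalog/book[price > 30]/title", 40)
stats.observe("/catalog/book[price > 50]/title", 20)
estimate = stats.estimate("/catalog/book[price > 10]/title")
```

Averaging over all queries of the same shape ignores how selectivity varies with the predicate value; it is used here only to keep the sketch short.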

Original language: English
Title of host publication: International Conference on Information and Knowledge Management, Proceedings
Editors: O. Frieder, J. Hammer, S. Qureshi, L. Seligman
Pages: 358-365
Number of pages: 8
Publication status: Published - 1 Dec 2003
Externally published: Yes
Event: CIKM 2003: Proceedings of the Twelfth ACM International Conference on Information and Knowledge Management - New Orleans, LA, United States
Duration: 3 Nov 2003 to 8 Nov 2003

Other

Other: CIKM 2003: Proceedings of the Twelfth ACM International Conference on Information and Knowledge Management
Country: United States
City: New Orleans, LA
Period: 3/11/03 to 8/11/03

Keywords

  • Database statistics
  • Hidden web
  • Query optimization
  • Selectivity estimation
  • XML

ASJC Scopus subject areas

  • Business, Management and Accounting (all)

Cite this

Aboulnaga, A., & Naughton, J. F. (2003). Building XML statistics for the hidden web. In O. Frieder, J. Hammer, S. Qureshi, & L. Seligman (Eds.), International Conference on Information and Knowledge Management, Proceedings (pp. 358-365)