Dr. Searcher and Mr. Browser: A unified hyperlink-click graph

Barbara Poblete, Carlos Castillo, Aristides Gionis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

15 Citations (Scopus)

Abstract

We introduce a unified graph representation of the Web, which includes both structural and usage information. We model this graph using a simple union of the Web's hyperlink and click graphs. The hyperlink graph expresses link structure among Web pages, while the click graph is a bipartite graph of queries and documents denoting users' searching behavior extracted from a search engine's query log. Our main motivation is to model, in a unified way, the two main activities of users on the Web, searching and browsing, and at the same time to analyze the effects of random walks on this new graph. The intuition behind this task is to measure how the combination of link structure and usage data provides additional information to that contained in these structures independently. Our experimental results show that both hyperlink and click graphs have strengths and weaknesses when it comes to using their stationary distribution scores for ranking Web pages. Furthermore, our evaluation indicates that the unified graph always generates consistent and robust scores that follow closely the best result obtained from either individual graph, even when applied to "noisy" data. We believe that the unified Web graph has several useful properties for improving current Web document ranking, as well as for generating new rankings of its own. In particular, stationary distribution scores derived from the random walks on the combined graph can be used as an indicator of whether structural or usage data are more reliable in different situations.
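The construction described in the abstract (a union of the hyperlink and click graphs, scored by the stationary distribution of a random walk) can be illustrated with a minimal sketch. The abstract does not specify edge weights, the direction of click edges, or how the walk is made ergodic, so everything below is an assumption made for illustration: click edges are treated as undirected and unweighted, a PageRank-style teleportation with damping 0.85 is used, and the names `union_graph` and `stationary_distribution` are hypothetical. It is not the authors' implementation.

```python
def union_graph(hyperlinks, clicks):
    """Return an adjacency dict for the union of two edge sets.

    hyperlinks: iterable of (source_doc, target_doc), directed.
    clicks: iterable of (query, clicked_doc); ASSUMED undirected here.
    Queries and documents get "q:" / "d:" prefixes so names cannot clash.
    """
    adj = {}
    for src, dst in hyperlinks:
        adj.setdefault("d:" + src, set()).add("d:" + dst)
        adj.setdefault("d:" + dst, set())
    for query, doc in clicks:
        adj.setdefault("q:" + query, set()).add("d:" + doc)
        adj.setdefault("d:" + doc, set()).add("q:" + query)
    return {node: sorted(nbrs) for node, nbrs in adj.items()}


def stationary_distribution(adj, damping=0.85, max_iters=200, tol=1e-12):
    """Power iteration for a random walk with uniform teleportation.

    The damping value and the uniform handling of dangling nodes are
    ASSUMPTIONS (standard PageRank choices), not taken from the paper.
    """
    nodes = sorted(adj)
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    for _ in range(max_iters):
        # Mass on nodes without out-edges is spread uniformly.
        dangling = sum(score[v] for v in nodes if not adj[v])
        base = (1.0 - damping) / n + damping * dangling / n
        nxt = {v: base for v in nodes}
        for v in nodes:
            if adj[v]:
                share = damping * score[v] / len(adj[v])
                for w in adj[v]:
                    nxt[w] += share
        delta = sum(abs(nxt[v] - score[v]) for v in nodes)
        score = nxt
        if delta < tol:
            break
    return score


# Toy data (invented for illustration only).
hyperlinks = [("a", "b"), ("b", "c"), ("c", "a")]
clicks = [("web graph", "b")]

unified = union_graph(hyperlinks, clicks)
scores = stationary_distribution(unified)
for node in sorted(scores, key=scores.get, reverse=True):
    print(f"{node}\t{scores[node]:.4f}")
```

In this toy example the clicked document receives probability mass both from its in-link and from the query node, so it ranks above the other documents; running the same function on the hyperlink-only or click-only adjacency gives the per-graph scores that the unified scores would be compared against.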

Original language: English
Title of host publication: International Conference on Information and Knowledge Management, Proceedings
Pages: 1123-1132
Number of pages: 10
DOI: https://doi.org/10.1145/1458082.1458231
Publication status: Published - 1 Dec 2008
Externally published: Yes
Event: 17th ACM Conference on Information and Knowledge Management, CIKM'08 - Napa Valley, CA, United States
Duration: 26 Oct 2008 to 30 Oct 2008

Other

Other: 17th ACM Conference on Information and Knowledge Management, CIKM'08
Country: United States
City: Napa Valley, CA
Period: 26/10/08 to 30/10/08

Fingerprint

Graph
World Wide Web
Ranking
Stationary distribution
Random walk
Bipartite graph
Search engine
Evaluation
Query logs
Intuition
Query
Graph model

Keywords

  • Algorithms
  • Experimentation
  • Human factors

ASJC Scopus subject areas

  • Business, Management and Accounting(all)
  • Decision Sciences(all)

Cite this

Poblete, B., Castillo, C., & Gionis, A. (2008). Dr. Searcher and Mr. Browser: A unified hyperlink-click graph. In International Conference on Information and Knowledge Management, Proceedings (pp. 1123-1132) https://doi.org/10.1145/1458082.1458231

Dr. Searcher and Mr. Browser : A unified hyperlink-click graph. / Poblete, Barbara; Castillo, Carlos; Gionis, Aristides.

International Conference on Information and Knowledge Management, Proceedings. 2008. p. 1123-1132.

Poblete, B, Castillo, C & Gionis, A 2008, Dr. Searcher and Mr. Browser: A unified hyperlink-click graph. in International Conference on Information and Knowledge Management, Proceedings. pp. 1123-1132, 17th ACM Conference on Information and Knowledge Management, CIKM'08, Napa Valley, CA, United States, 26/10/08. https://doi.org/10.1145/1458082.1458231
Poblete B, Castillo C, Gionis A. Dr. Searcher and Mr. Browser: A unified hyperlink-click graph. In International Conference on Information and Knowledge Management, Proceedings. 2008. p. 1123-1132 https://doi.org/10.1145/1458082.1458231
Poblete, Barbara ; Castillo, Carlos ; Gionis, Aristides. / Dr. Searcher and Mr. Browser : A unified hyperlink-click graph. International Conference on Information and Knowledge Management, Proceedings. 2008. pp. 1123-1132
@inproceedings{e44258e8106c41e4b8f1062db98283f1,
title = "Dr. Searcher and Mr. Browser: A unified hyperlink-click graph",
abstract = "We introduce a unified graph representation of the Web, which includes both structural and usage information. We model this graph using a simple union of the Web's hyperlink and click graphs. The hyperlink graph expresses link structure among Web pages, while the click graph is a bipartite graph of queries and documents denoting users' searching behavior extracted from a search engine's query log. Our most important motivation is to model in a unified way the two main activities of users on the Web: searching and browsing, and at the same time to analyze the effects of random walks on this new graph. The intuition behind this task is to measure how the combination of link structure and usage data provide additional information to that contained in these structures independently. Our experimental results show that both hyperlink and click graphs have strengths and weaknesses when it comes to using their stationary distribution scores for ranking Web pages. Furthermore, our evaluation indicates that the unified graph always generates consistent and robust scores that follow closely the best result obtained from either individual graph, even when applied to {"}noisy{"} data. It is our belief that the unified Web graph has several useful properties for improving current Web document ranking, as well as for generating new rankings of its own. In particular stationary distribution scores derived from the random walks on the combined graph can be used as an indicator of whether structural or usage data are more reliable in different situations.",
keywords = "Algorithms, Experimentation, Human factors",
author = "Barbara Poblete and Carlos Castillo and Aristides Gionis",
year = "2008",
month = "12",
day = "1",
doi = "10.1145/1458082.1458231",
language = "English",
isbn = "9781595939913",
pages = "1123--1132",
booktitle = "International Conference on Information and Knowledge Management, Proceedings",

}

TY - GEN

T1 - Dr. Searcher and Mr. Browser

T2 - A unified hyperlink-click graph

AU - Poblete, Barbara

AU - Castillo, Carlos

AU - Gionis, Aristides

PY - 2008/12/1

Y1 - 2008/12/1

AB - We introduce a unified graph representation of the Web, which includes both structural and usage information. We model this graph using a simple union of the Web's hyperlink and click graphs. The hyperlink graph expresses link structure among Web pages, while the click graph is a bipartite graph of queries and documents denoting users' searching behavior extracted from a search engine's query log. Our most important motivation is to model in a unified way the two main activities of users on the Web: searching and browsing, and at the same time to analyze the effects of random walks on this new graph. The intuition behind this task is to measure how the combination of link structure and usage data provide additional information to that contained in these structures independently. Our experimental results show that both hyperlink and click graphs have strengths and weaknesses when it comes to using their stationary distribution scores for ranking Web pages. Furthermore, our evaluation indicates that the unified graph always generates consistent and robust scores that follow closely the best result obtained from either individual graph, even when applied to "noisy" data. It is our belief that the unified Web graph has several useful properties for improving current Web document ranking, as well as for generating new rankings of its own. In particular stationary distribution scores derived from the random walks on the combined graph can be used as an indicator of whether structural or usage data are more reliable in different situations.

KW - Algorithms

KW - Experimentation

KW - Human factors

UR - http://www.scopus.com/inward/record.url?scp=70349234516&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70349234516&partnerID=8YFLogxK

U2 - 10.1145/1458082.1458231

DO - 10.1145/1458082.1458231

M3 - Conference contribution

AN - SCOPUS:70349234516

SN - 9781595939913

SP - 1123

EP - 1132

BT - International Conference on Information and Knowledge Management, Proceedings

ER -