Mining temporal patterns in popularity of web items

Woong Kee Loh, Sandeep Mane, Jaideep Srivastava

Research output: Contribution to journalArticle

8 Citations (Scopus)

Abstract

Huge amounts of various web items (e.g., images, keywords, and web pages) are being made available on the Web. The popularity of such web items continuously changes over time, and mining for temporal patterns in the popularity of web items is an important problem that is useful for several Web applications; for example, the temporal patterns in the popularity of web search keywords help web search enterprises predict future popular keywords, thus enabling them to make price decisions when marketing search keywords to advertisers. However, the presence of millions of web items makes it difficult to scale up previous techniques for this problem. This paper proposes an efficient method for mining temporal patterns in the popularity of web items. We treat the popularity of web items as time-series and propose a novel measure, a gap measure, to quantify the dissimilarity between the popularity of two web items. To reduce the computational overhead for this measure, an efficient method using the Discrete Fourier Transform (DFT) is presented. We assume that the popularity of web items is not necessarily periodic. For finding clusters of web items with similar popularity trends, we show the limitations of traditional clustering approaches and propose a scalable, efficient, density-based clustering algorithm using the gap measure. Our experiments using the popularity trends of web search keywords obtained from the Google Trends web site illustrate the scalability and usefulness of the proposed approach in real-world applications.

Original languageEnglish
Pages (from-to)5010-5028
Number of pages19
JournalInformation sciences
Volume181
Issue number22
DOIs
Publication statusPublished - 15 Nov 2011

    Fingerprint

Keywords

  • Density-based clustering
  • Gap measure
  • Popularity trends
  • Temporal patterns
  • Web items

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Theoretical Computer Science
  • Computer Science Applications
  • Information Systems and Management
  • Artificial Intelligence

Cite this