Exploring video content structure for hierarchical summarization

Xingquan Zhu, Xindong Wu, Jianping Fan, Ahmed K. Elmagarmid, Walid G. Aref

Research output: Contribution to journal › Article

65 Citations (Scopus)


In this paper, we propose a hierarchical video summarization strategy that explores video content structure to provide users with a scalable, multilevel video summary. First, video-shot-segmentation and keyframe-extraction algorithms are applied to parse video sequences into physical shots and discrete keyframes. Next, an affinity (self-correlation) matrix is constructed to merge visually similar shots into clusters (supergroups). Since high visual similarity between shots does not necessarily imply that they belong to the same story unit, temporal information is incorporated by merging temporally adjacent shots (within a specified distance) from each supergroup into video groups. A video-scene-detection algorithm is then proposed to merge temporally or spatially correlated video groups into scenario units. This is followed by a scene-clustering algorithm that eliminates visual redundancy among the units. A hierarchical video content structure with increasing granularity is constructed, from clustered scenes, video scenes, and video groups down to keyframes. Finally, we introduce a hierarchical video summarization scheme that executes various approaches at different levels of the video content hierarchy to statically or dynamically construct the video summary. Extensive experiments based on real-world videos have been performed to validate the effectiveness of the proposed approach.
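The shot-grouping steps described in the abstract (affinity matrix, supergroups, temporally constrained video groups) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' method: each shot is represented by a normalized keyframe histogram, similarity is histogram intersection, supergroups are connected components under a similarity threshold, and the names `threshold` and `max_gap` are hypothetical parameters.

```python
from itertools import combinations


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two normalized histograms (assumed feature)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def affinity_matrix(features):
    """Pairwise shot-to-shot similarity matrix."""
    n = len(features)
    return [[histogram_intersection(features[i], features[j]) for j in range(n)]
            for i in range(n)]


def supergroups(affinity, threshold):
    """Cluster shots as connected components of the graph that links
    every pair of shots whose affinity is at least `threshold`."""
    n = len(affinity)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if affinity[i][j] >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


def split_by_time(cluster, max_gap):
    """Split a supergroup (ascending shot indices) wherever two consecutive
    members are more than `max_gap` shots apart."""
    groups = [[cluster[0]]]
    for shot in cluster[1:]:
        if shot - groups[-1][-1] <= max_gap:
            groups[-1].append(shot)
        else:
            groups.append([shot])
    return groups


def video_groups(features, threshold=0.8, max_gap=2):
    """Supergroups refined by temporal adjacency (hypothetical defaults)."""
    groups = []
    for cluster in supergroups(affinity_matrix(features), threshold):
        groups.extend(split_by_time(cluster, max_gap))
    return sorted(groups)


# Toy input: seven shots alternating between two visual settings, A and B.
A = [0.9, 0.1, 0.0]
B = [0.0, 0.1, 0.9]
shots = [A, B, A, B, B, B, A]
print(video_groups(shots))  # [[0, 2], [1, 3, 4, 5], [6]]
```

In the toy input, shot 6 looks like shots 0 and 2 but is too far from them in time, so it ends up in its own group, which mirrors the abstract's point that visual similarity alone does not establish a shared story unit.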

Original language: English
Pages (from-to): 98-115
Number of pages: 18
Journal: Multimedia Systems
Issue number: 2
Publication status: Published - 1 Aug 2004

Keywords

  • Hierarchical clustering
  • Hierarchical video summarization
  • Video content hierarchy
  • Video group detection
  • Video scene detection

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Media Technology
  • Hardware and Architecture
  • Computer Networks and Communications
