Exploring video content structure for hierarchical summarization

Xingquan Zhu, Xindong Wu, Jianping Fan, Ahmed Elmagarmid, Walid G. Aref

Research output: Contribution to journal › Article

62 Citations (Scopus)

Abstract

In this paper, we propose a hierarchical video summarization strategy that explores video content structure to provide users with a scalable, multilevel video summary. First, video-shot-segmentation and keyframe-extraction algorithms are applied to parse video sequences into physical shots and discrete keyframes. Next, an affinity (self-correlation) matrix is constructed to merge visually similar shots into clusters (supergroups). Since high visual similarity between video shots does not necessarily imply that they belong to the same story unit, temporal information is adopted by merging temporally adjacent shots (within a specified distance) from each supergroup into a video group. A video-scene-detection algorithm is then proposed to merge temporally or spatially correlated video groups into scenario units. This is followed by a scene-clustering algorithm that eliminates visual redundancy among the units. A hierarchical video content structure with increasing granularity is constructed from the clustered scenes, video scenes, and video groups to keyframes. Finally, we introduce a hierarchical video summarization scheme by executing various approaches at different levels of the video content hierarchy to statically or dynamically construct the video summary. Extensive experiments based on real-world videos have been performed to validate the effectiveness of the proposed approach.
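The shot-grouping steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the features, the similarity measure, or the clustering rule, so cosine similarity over toy feature vectors, connected-component merging above a threshold, and the names `affinity_matrix`, `supergroups`, `video_groups`, `sim_threshold`, and `max_gap` are all assumptions made for illustration.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity of two feature vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def affinity_matrix(features):
    """Pairwise shot-to-shot similarity; features[i] describes shot i's keyframe."""
    n = len(features)
    return [[cosine_similarity(features[i], features[j]) for j in range(n)]
            for i in range(n)]


def supergroups(affinity, sim_threshold):
    """Merge visually similar shots, ignoring time.

    Stand-in rule (an assumption): shots are linked when their affinity is at
    least sim_threshold, and each connected component becomes a supergroup.
    """
    n = len(affinity)
    seen = set()
    groups = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if j not in seen and affinity[i][j] >= sim_threshold:
                    seen.add(j)
                    stack.append(j)
        groups.append(sorted(component))
    return groups


def video_groups(supergroup, max_gap):
    """Split one supergroup (sorted shot indices) wherever consecutive shots
    are more than max_gap positions apart, so each group is temporally local."""
    groups = [[supergroup[0]]]
    for shot in supergroup[1:]:
        if shot - groups[-1][-1] <= max_gap:
            groups[-1].append(shot)
        else:
            groups.append([shot])
    return groups


if __name__ == "__main__":
    # Toy features for six shots in temporal order (hypothetical data).
    features = [[1, 0], [0.9, 0.1], [0, 1], [0.1, 0.9], [0, 1], [1, 0]]
    for sg in supergroups(affinity_matrix(features), sim_threshold=0.9):
        print(sg, "->", video_groups(sg, max_gap=2))
```

On the toy data, shots 0, 1, and 5 look alike and land in one supergroup, but shot 5 is too far from shot 1 in time and is split into its own video group. This mirrors the abstract's point that visual similarity alone does not place shots in the same story unit.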

Original language: English
Pages (from-to): 98-115
Number of pages: 18
Journal: Multimedia Systems
Volume: 10
Issue number: 2
DOIs: 10.1007/s00530-004-0142-7
Publication status: Published - 1 Aug 2004
Externally published: Yes

Keywords

  • Hierarchical clustering
  • Hierarchical video summarization
  • Video content hierarchy
  • Video group detection
  • Video scene detection

ASJC Scopus subject areas

  • Information Systems
  • Theoretical Computer Science
  • Computational Theory and Mathematics

Cite this

Exploring video content structure for hierarchical summarization. / Zhu, Xingquan; Wu, Xindong; Fan, Jianping; Elmagarmid, Ahmed; Aref, Walid G.

In: Multimedia Systems, Vol. 10, No. 2, 01.08.2004, p. 98-115.

@article{c3c31bfd729f4821b6ceedb105cfae6e,
title = "Exploring video content structure for hierarchical summarization",
abstract = "In this paper, we propose a hierarchical video summarization strategy that explores video content structure to provide users with a scalable, multilevel video summary. First, video-shot-segmentation and keyframe-extraction algorithms are applied to parse video sequences into physical shots and discrete keyframes. Next, an affinity (self-correlation) matrix is constructed to merge visually similar shots into clusters (supergroups). Since high visual similarity between video shots does not necessarily imply that they belong to the same story unit, temporal information is adopted by merging temporally adjacent shots (within a specified distance) from each supergroup into a video group. A video-scene-detection algorithm is then proposed to merge temporally or spatially correlated video groups into scenario units. This is followed by a scene-clustering algorithm that eliminates visual redundancy among the units. A hierarchical video content structure with increasing granularity is constructed from the clustered scenes, video scenes, and video groups to keyframes. Finally, we introduce a hierarchical video summarization scheme by executing various approaches at different levels of the video content hierarchy to statically or dynamically construct the video summary. Extensive experiments based on real-world videos have been performed to validate the effectiveness of the proposed approach.",
keywords = "Hierarchical clustering, Hierarchical video summarization, Video content hierarchy, Video group detection, Video scene detection",
author = "Xingquan Zhu and Xindong Wu and Jianping Fan and Ahmed Elmagarmid and Aref, {Walid G.}",
year = "2004",
month = "8",
day = "1",
doi = "10.1007/s00530-004-0142-7",
language = "English",
volume = "10",
pages = "98--115",
journal = "Multimedia Systems",
issn = "0942-4962",
publisher = "Springer Verlag",
number = "2",

}

TY - JOUR

T1 - Exploring video content structure for hierarchical summarization

AU - Zhu, Xingquan

AU - Wu, Xindong

AU - Fan, Jianping

AU - Elmagarmid, Ahmed

AU - Aref, Walid G.

PY - 2004/8/1

Y1 - 2004/8/1

N2 - In this paper, we propose a hierarchical video summarization strategy that explores video content structure to provide users with a scalable, multilevel video summary. First, video-shot-segmentation and keyframe-extraction algorithms are applied to parse video sequences into physical shots and discrete keyframes. Next, an affinity (self-correlation) matrix is constructed to merge visually similar shots into clusters (supergroups). Since high visual similarity between video shots does not necessarily imply that they belong to the same story unit, temporal information is adopted by merging temporally adjacent shots (within a specified distance) from each supergroup into a video group. A video-scene-detection algorithm is then proposed to merge temporally or spatially correlated video groups into scenario units. This is followed by a scene-clustering algorithm that eliminates visual redundancy among the units. A hierarchical video content structure with increasing granularity is constructed from the clustered scenes, video scenes, and video groups to keyframes. Finally, we introduce a hierarchical video summarization scheme by executing various approaches at different levels of the video content hierarchy to statically or dynamically construct the video summary. Extensive experiments based on real-world videos have been performed to validate the effectiveness of the proposed approach.

KW - Hierarchical clustering

KW - Hierarchical video summarization

KW - Video content hierarchy

KW - Video group detection

KW - Video scene detection

UR - http://www.scopus.com/inward/record.url?scp=10644262225&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=10644262225&partnerID=8YFLogxK

U2 - 10.1007/s00530-004-0142-7

DO - 10.1007/s00530-004-0142-7

M3 - Article

VL - 10

SP - 98

EP - 115

JO - Multimedia Systems

JF - Multimedia Systems

SN - 0942-4962

IS - 2

ER -