Evaluation of audio features for audio-visual analysis of dance figures (Dans figürlerinin işitsel-görsel analizi için işitsel özniteliklerin değerlendirilmesi)

Y. Demir, Ferda Ofli, E. Erzin, Y. Yemez, A. M. Tekalp

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We present a framework for selecting the best audio features for audiovisual analysis and synthesis of dance figures. Dance figures are performed in synchrony with the musical rhythm, so they can be analyzed through the audio spectra using spectral and rhythmic musical features. In the proposed audio feature evaluation system, dance figures are manually labeled over the video stream. The music segments that correspond to the labeled dance figures are used to train hidden Markov models (HMMs) that learn the temporal spectral patterns of the dance figures. The dance figure recognition performance of the HMMs is evaluated for various spectral feature sets, and the audio features that maximize recognition performance are selected as the best audio features for the analyzed audiovisual dance recordings. In our evaluations, mel-frequency cepstral coefficients (MFCCs) with their first and second derivatives, spectral centroid, spectral flux, and spectral roll-off are used as candidate audio features. The selected audio features can then be used for the analysis and synthesis of audio-driven body animation.
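As an illustration of three of the candidate features named in the abstract, the following is a minimal sketch of spectral centroid, spectral roll-off, and spectral flux computed per frame. It is not the authors' implementation: the function names, the frame length, the hop size, and the 85% roll-off fraction are assumptions chosen for the example, and the MFCC and HMM stages are omitted.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2 for one frame (naive O(N^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(mag, sample_rate, n_fft):
    """Magnitude-weighted mean frequency in Hz."""
    total = sum(mag)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n_fft * m for k, m in enumerate(mag)) / total


def spectral_rolloff(mag, sample_rate, n_fft, fraction=0.85):
    """Lowest bin frequency (Hz) at which the cumulative magnitude
    reaches `fraction` of the total. The 0.85 default is an assumption."""
    threshold = fraction * sum(mag)
    running = 0.0
    for k, m in enumerate(mag):
        running += m
        if running >= threshold:
            return k * sample_rate / n_fft
    return (len(mag) - 1) * sample_rate / n_fft


def spectral_flux(prev_mag, mag):
    """Euclidean distance between consecutive magnitude spectra."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mag, prev_mag)))


def frame_features(signal, sample_rate, frame_len=64, hop=32):
    """Return one (centroid, rolloff, flux) tuple per frame.
    frame_len and hop are illustrative values, not taken from the paper."""
    feats = []
    prev = None
    for start in range(0, len(signal) - frame_len + 1, hop):
        mag = magnitude_spectrum(signal[start:start + frame_len])
        flux = 0.0 if prev is None else spectral_flux(prev, mag)
        feats.append((
            spectral_centroid(mag, sample_rate, frame_len),
            spectral_rolloff(mag, sample_rate, frame_len),
            flux,
        ))
        prev = mag
    return feats


if __name__ == "__main__":
    sr = 8000
    # A steady 1 kHz tone that falls exactly on DFT bin 8 of a 64-sample frame.
    tone = [math.sin(2 * math.pi * 1000 * t / sr) for t in range(128)]
    for centroid, rolloff, flux in frame_features(tone, sr):
        print(round(centroid, 3), round(rolloff, 3), round(flux, 6))
```

For a steady tone like the one in the usage example, the centroid and roll-off both sit at the tone frequency and the flux between frames stays near zero. In a selection scheme like the one described above, such per-frame vectors (alone or combined with MFCCs) would form the observation sequences on which the per-figure HMMs are trained and compared.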

Original language: Undefined/Unknown
Title of host publication: 2008 IEEE 16th Signal Processing, Communication and Applications Conference, SIU
DOIs: https://doi.org/10.1109/SIU.2008.4632707
Publication status: Published - 26 Nov 2008
Externally published: Yes
Event: 2008 IEEE 16th Signal Processing, Communication and Applications Conference, SIU - Aydin, Turkey
Duration: 20 Apr 2008 → 22 Apr 2008

Other

Country: Turkey
City: Aydin
Period: 20/4/08 → 22/4/08

Keywords

  • Audio-driven body animation
  • Audio-visual analysis

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Communication

Cite this

Demir, Y., Ofli, F., Erzin, E., Yemez, Y., & Tekalp, A. M. (2008). Dans figürlerinin işitsel-görsel analizi için işitsel özniteliklerin değerlendirilmesi [Evaluation of audio features for audio-visual analysis of dance figures]. In 2008 IEEE 16th Signal Processing, Communication and Applications Conference, SIU [4632707] https://doi.org/10.1109/SIU.2008.4632707

@inproceedings{c3f580d7bcd344498ce17caa3efbb8a0,
title = "Dans fig{\"u}rlerinin i{\c{s}}itsel-g{\"o}rsel analizi i{\c{c}}in i{\c{s}}itsel {\"o}zniteliklerin de{\u{g}}erlendirilmesi",
keywords = "Audio-driven body animation, Audio-visual analysis",
author = "Y. Demir and Ferda Ofli and E. Erzin and Y. Yemez and Tekalp, {A. M.}",
year = "2008",
month = "11",
day = "26",
doi = "10.1109/SIU.2008.4632707",
language = "Undefined/Unknown",
isbn = "9781424419999",
booktitle = "2008 IEEE 16th Signal Processing, Communication and Applications Conference, SIU",

}

TY - GEN

T1 - Dans figürlerinin işitsel-görsel analizi için işitsel özniteliklerin değerlendirilmesi

AU - Demir, Y.

AU - Ofli, Ferda

AU - Erzin, E.

AU - Yemez, Y.

AU - Tekalp, A. M.

PY - 2008/11/26

Y1 - 2008/11/26

KW - Audio-driven body animation

KW - Audio-visual analysis

UR - http://www.scopus.com/inward/record.url?scp=56449097955&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=56449097955&partnerID=8YFLogxK

U2 - 10.1109/SIU.2008.4632707

DO - 10.1109/SIU.2008.4632707

M3 - Conference contribution

SN - 9781424419999

BT - 2008 IEEE 16th Signal Processing, Communication and Applications Conference, SIU

ER -