A voice activity detector using the chi-square test

Beena Ahmed, W. Harvey Holmes

Research output: Contribution to journal › Article

11 Citations (Scopus)

Abstract

This paper proposes a voice activity detector (VAD) that makes the speech/noise classification by applying the statistical chi-square test to each frame. It also uses a continuous update of the background noise estimate. The speech is first enhanced using a noise reduction system, with noise estimates also obtained with the help of the chi-square test. The noise-reduced signal is decomposed into sub-bands, and the chi-square test is used again in another form to compare the observed signal distribution to the estimated noise distribution. If the chi-square test determines that they are close, the frame is declared to be noise, otherwise speech. The performance of this VAD was found to be significantly superior to several benchmark VADs, with accuracies above 89% even at an SNR of 0 dB, which is up to 25% better than the others.
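The frame-level decision described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the test, so the code below assumes a plain Pearson goodness-of-fit statistic over equiprobable amplitude bins derived from a noise reference. The function names, the choice of 8 bins, and the threshold are illustrative assumptions, not the authors' implementation, and the noise reduction, sub-band decomposition, and continuous noise update steps are omitted.

```python
import bisect


def equiprobable_edges(noise, n_bins):
    """Interior bin edges that split the noise samples into n_bins
    roughly equally likely amplitude bins (hypothetical helper)."""
    s = sorted(noise)
    return [s[(k * len(s)) // n_bins] for k in range(1, n_bins)]


def chi_square_stat(frame, edges):
    """Pearson chi-square statistic of a frame against the noise bins.

    Under the noise hypothesis each bin is expected to receive the same
    share of the frame's samples, because the bins are equiprobable.
    """
    n_bins = len(edges) + 1
    observed = [0] * n_bins
    for x in frame:
        observed[bisect.bisect_right(edges, x)] += 1
    expected = len(frame) / n_bins
    return sum((o - expected) ** 2 / expected for o in observed)


def is_speech(frame, edges, threshold=14.067):
    """Declare speech when the frame's distribution is far from noise.

    The default threshold is the 5% chi-square critical value for
    7 degrees of freedom (8 bins); this is an assumed setting.
    """
    return chi_square_stat(frame, edges) > threshold


# Deterministic toy data: a noise reference and two frames.
noise_reference = list(range(800))
edges = equiprobable_edges(noise_reference, 8)

noise_like_frame = list(range(0, 800, 5))   # spread evenly over all bins
speech_like_frame = [1000] * 160            # all samples in the top bin

print(is_speech(noise_like_frame, edges))   # False
print(is_speech(speech_like_frame, edges))  # True
```

A frame whose samples spread across the bins like the noise reference yields a small statistic and is labelled noise; a frame concentrated in a few bins yields a large statistic and is labelled speech. In a real detector the threshold and bin count would need tuning against labelled data.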

Original language: English
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 1
Publication status: Published - 2004
Externally published: Yes

Fingerprint

Detectors
background noise
estimates
Noise abatement
noise reduction
Acoustic noise

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Signal Processing
  • Acoustics and Ultrasonics

Cite this

@article{2629239bbe124f4682ecda930540f59d,
title = "A voice activity detector using the chi-square test",
abstract = "This paper proposes a voice activity detector (VAD) that makes the speech/noise classification by applying the statistical chi-square test to each frame. It also uses a continuous update of the background noise estimate. The speech is first enhanced using a noise reduction system, with noise estimates also obtained with the help of the chi-square test. The noise-reduced signal is decomposed into sub-bands, and the chi-square test is used again in another form to compare the observed signal distribution to the estimated noise distribution. If the chi-square test determines that they are close, the frame is declared to be noise, otherwise speech. The performance of this VAD was found to be significantly superior to several benchmark VADs, with accuracies above 89{\%} even at an SNR of 0 dB, which is up to 25{\%} better than the others.",
author = "Ahmed, Beena and Holmes, {W. Harvey}",
year = "2004",
language = "English",
volume = "1",
journal = "Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing",
issn = "0736-7791",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - A voice activity detector using the chi-square test

AU - Ahmed, Beena

AU - Holmes, W. Harvey

PY - 2004

Y1 - 2004

N2 - This paper proposes a voice activity detector (VAD) that makes the speech/noise classification by applying the statistical chi-square test to each frame. It also uses a continuous update of the background noise estimate. The speech is first enhanced using a noise reduction system, with noise estimates also obtained with the help of the chi-square test. The noise-reduced signal is decomposed into sub-bands, and the chi-square test is used again in another form to compare the observed signal distribution to the estimated noise distribution. If the chi-square test determines that they are close, the frame is declared to be noise, otherwise speech. The performance of this VAD was found to be significantly superior to several benchmark VADs, with accuracies above 89% even at an SNR of 0 dB, which is up to 25% better than the others.

AB - This paper proposes a voice activity detector (VAD) that makes the speech/noise classification by applying the statistical chi-square test to each frame. It also uses a continuous update of the background noise estimate. The speech is first enhanced using a noise reduction system, with noise estimates also obtained with the help of the chi-square test. The noise-reduced signal is decomposed into sub-bands, and the chi-square test is used again in another form to compare the observed signal distribution to the estimated noise distribution. If the chi-square test determines that they are close, the frame is declared to be noise, otherwise speech. The performance of this VAD was found to be significantly superior to several benchmark VADs, with accuracies above 89% even at an SNR of 0 dB, which is up to 25% better than the others.

UR - http://www.scopus.com/inward/record.url?scp=4544260272&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=4544260272&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:4544260272

VL - 1

JO - Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing

JF - Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing

SN - 0736-7791

ER -