This paper proposes a voice activity detector (VAD) that classifies each frame as speech or noise by applying the statistical chi-square test, and that continuously updates its background noise estimate. The speech is first enhanced by a noise reduction system whose noise estimates are also obtained with the help of the chi-square test. The noise-reduced signal is then decomposed into sub-bands, and the chi-square test is applied again, in a different form, to compare the observed signal distribution with the estimated noise distribution. If the test finds the two distributions to be close, the frame is declared noise; otherwise it is declared speech. The proposed VAD was found to perform significantly better than several benchmark VADs, with accuracies above 89% even at an SNR of 0 dB, which is up to 25% better than the others.
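The abstract does not give the exact form of the test statistic, the number of sub-bands, or the noise update rule, so the following is only a minimal sketch of the decision step under stated assumptions. It assumes that each frame is summarised as a histogram of counts over 8 bins, that the noise model is a set of bin probabilities, that closeness is judged with Pearson's chi-square goodness-of-fit statistic at the 5% level (critical value 14.067 for 7 degrees of freedom), and that the noise model is updated by exponential smoothing. The function names, the bin count, the threshold, and the smoothing factor `alpha` are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def chi_square_statistic(observed: Sequence[float],
                         expected: Sequence[float]) -> float:
    """Pearson statistic: sum over bins of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    eps = 1e-12  # guards against division by zero for empty expected bins
    return sum((o - e) ** 2 / max(e, eps) for o, e in zip(observed, expected))


def classify_frame(observed_counts: Sequence[float],
                   noise_probs: Sequence[float],
                   critical_value: float = 14.067) -> str:
    """Declare the frame noise if its histogram is close to the noise model.

    The default critical value is the 5% point of the chi-square
    distribution with 7 degrees of freedom (8 bins); this is an assumed
    setting, not one reported in the paper.
    """
    total = sum(observed_counts)
    expected = [p * total for p in noise_probs]
    stat = chi_square_statistic(observed_counts, expected)
    return "noise" if stat <= critical_value else "speech"


def update_noise_probs(noise_probs: Sequence[float],
                       observed_counts: Sequence[float],
                       alpha: float = 0.9) -> List[float]:
    """Exponentially smooth the noise model towards a noise-labelled frame."""
    total = sum(observed_counts)
    if total == 0:
        return list(noise_probs)
    return [alpha * p + (1.0 - alpha) * (c / total)
            for p, c in zip(noise_probs, observed_counts)]


if __name__ == "__main__":
    noise_model = [1.0 / 8] * 8                   # assumed flat noise model
    flat_frame = [12, 13, 12, 13, 12, 13, 12, 13]  # resembles the noise model
    peaky_frame = [60, 20, 10, 5, 2, 1, 1, 1]      # far from the noise model
    for frame in (flat_frame, peaky_frame):
        label = classify_frame(frame, noise_model)
        if label == "noise":
            # Only noise-labelled frames refine the background estimate.
            noise_model = update_noise_probs(noise_model, frame)
        print(label)
```

In this sketch only frames labelled noise feed back into the noise model, which is one simple way to realise a continuous background update; the paper may use a different rule.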
|Journal||ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings|
|Publication status||Published - 28 Sep 2004|
|Event||Proceedings - IEEE International Conference on Acoustics, Speech, and Signal Processing - Montreal, Que, Canada|
Duration: 17 May 2004 → 21 May 2004
ASJC Scopus subject areas
- Signal Processing
- Electrical and Electronic Engineering