Comparison of GMM-HMM and DNN-HMM based pronunciation verification techniques for use in the assessment of childhood apraxia of speech

Mostafa Shahin, Beena Ahmed, Jacqueline McKechnie, Kirrie Ballard, Ricardo Gutierrez-Osuna

Research output: Contribution to journal › Article

9 Citations (Scopus)

Abstract

This paper introduces a pronunciation verification method for use in an automatic assessment therapy tool for child disordered speech. The proposed method creates a phone-based search lattice that is flexible enough to cover all probable mispronunciations. This allows us to verify the correctness of the pronunciation and detect the incorrect phonemes produced by the child. We compare two acoustic models: the conventional GMM-HMM and the hybrid DNN-HMM. Results show that the hybrid DNN-HMM outperforms the conventional GMM-HMM in all experiments on both normal and disordered speech. The total correctness accuracy of the system at the phoneme level is above 85% when used with disordered speech.

Original language: English
Pages (from-to): 1583-1587
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - 2014

Keywords

  • Automatic speech recognition
  • Computer aided pronunciation learning
  • Deep learning
  • Pronunciation verification
  • Speech therapy

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation

Cite this

@article{67cdbdce15b94a8aa1ffdeff36a87fea,
title = "Comparison of GMM-HMM and DNN-HMM based pronunciation verification techniques for use in the assessment of childhood apraxia of speech",
abstract = "This paper introduces a pronunciation verification method for use in an automatic assessment therapy tool for child disordered speech. The proposed method creates a phone-based search lattice that is flexible enough to cover all probable mispronunciations. This allows us to verify the correctness of the pronunciation and detect the incorrect phonemes produced by the child. We compare two acoustic models: the conventional GMM-HMM and the hybrid DNN-HMM. Results show that the hybrid DNN-HMM outperforms the conventional GMM-HMM in all experiments on both normal and disordered speech. The total correctness accuracy of the system at the phoneme level is above 85{\%} when used with disordered speech.",
keywords = "Automatic speech recognition, Computer aided pronunciation learning, Deep learning, Pronunciation verification, Speech therapy",
author = "Mostafa Shahin and Beena Ahmed and Jacqueline McKechnie and Kirrie Ballard and Ricardo Gutierrez-Osuna",
year = "2014",
language = "English",
pages = "1583--1587",
journal = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",
issn = "2308-457X",

}

TY - JOUR

T1 - Comparison of GMM-HMM and DNN-HMM based pronunciation verification techniques for use in the assessment of childhood apraxia of speech

AU - Shahin, Mostafa

AU - Ahmed, Beena

AU - McKechnie, Jacqueline

AU - Ballard, Kirrie

AU - Gutierrez-Osuna, Ricardo

PY - 2014

Y1 - 2014

AB - This paper introduces a pronunciation verification method for use in an automatic assessment therapy tool for child disordered speech. The proposed method creates a phone-based search lattice that is flexible enough to cover all probable mispronunciations. This allows us to verify the correctness of the pronunciation and detect the incorrect phonemes produced by the child. We compare two acoustic models: the conventional GMM-HMM and the hybrid DNN-HMM. Results show that the hybrid DNN-HMM outperforms the conventional GMM-HMM in all experiments on both normal and disordered speech. The total correctness accuracy of the system at the phoneme level is above 85% when used with disordered speech.

KW - Automatic speech recognition

KW - Computer aided pronunciation learning

KW - Deep learning

KW - Pronunciation verification

KW - Speech therapy

UR - http://www.scopus.com/inward/record.url?scp=84910091933&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84910091933&partnerID=8YFLogxK

M3 - Article

SP - 1583

EP - 1587

JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

SN - 2308-457X

ER -