Anomaly detection based pronunciation verification approach using speech attribute features

Mostafa Shahin, Beena Ahmed

Research output: Contribution to journal › Article


Computer-aided pronunciation training tools require accurate automatic pronunciation error detection algorithms to identify errors made by their users. However, the performance of these algorithms depends heavily on the amount of mispronounced speech data used to train them and on the reliability of its manual annotation. To overcome this problem, we recast mispronunciation detection as an anomaly detection problem, which uses algorithms trained only on correctly pronounced speech data. In this work we adopted the One-Class SVM as our anomaly detection model, with a separate model built for each phoneme. Each model was fed a set of speech attribute features, namely the manners and places of articulation, extracted from a bank of binary DNN speech attribute detectors. We also applied multi-task learning and dropout to alleviate overfitting in the DNN speech attribute detectors. We trained the system on the standard WSJ0 and TIMIT data sets, which contain only native English speech, and then evaluated it on three different data sets: a native English speaker corpus with artificial errors, a foreign-accented speech corpus, and a children's disordered speech corpus. Finally, we compared our system with the conventional Goodness-of-Pronunciation (GOP) algorithm to demonstrate the effectiveness of our method. The results show that our method reduced the false-acceptance and false-rejection rates by 26% and 39%, respectively, compared to the GOP method.
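
The central idea of the abstract, one anomaly detector per phoneme trained only on correctly pronounced speech, can be illustrated with a minimal sketch. It assumes scikit-learn's OneClassSVM as the model and uses synthetic 5-dimensional vectors as stand-ins for the speech attribute scores. The function names, the kernel and nu settings, and all data below are hypothetical and are not taken from the paper.

```python
import numpy as np
from sklearn.svm import OneClassSVM


def train_phoneme_models(correct_features, nu=0.1):
    """Fit one One-Class SVM per phoneme from correctly pronounced samples only.

    correct_features maps a phoneme label to an array of shape
    (n_samples, n_attributes). Each row stands in for the speech attribute
    scores (manner and place of articulation) of one phoneme segment.
    """
    models = {}
    for phoneme, feats in correct_features.items():
        # Kernel and nu are illustrative choices, not the paper's settings.
        model = OneClassSVM(kernel="rbf", gamma="scale", nu=nu)
        model.fit(np.asarray(feats, dtype=float))
        models[phoneme] = model
    return models


def is_correct(models, phoneme, feats):
    """Return a boolean array, True where a segment resembles the training data."""
    preds = models[phoneme].predict(np.atleast_2d(feats))
    return preds == 1  # scikit-learn labels inliers +1 and outliers -1


# Synthetic stand-in data (assumption: invented for illustration only).
rng = np.random.default_rng(0)
typical = np.array([0.9, 0.1, 0.8, 0.2, 0.1])   # attribute pattern of correct /s/
atypical = np.array([0.1, 0.9, 0.2, 0.8, 0.1])  # a clearly different pattern

train = {"s": np.clip(rng.normal(typical, 0.05, size=(200, 5)), 0, 1)}
models = train_phoneme_models(train)

held_out_correct = np.clip(rng.normal(typical, 0.05, size=(50, 5)), 0, 1)
mispronounced = np.clip(rng.normal(atypical, 0.05, size=(50, 5)), 0, 1)

# False rejection: correct speech flagged as an error.
# False acceptance: mispronounced speech accepted as correct.
frr = 1.0 - is_correct(models, "s", held_out_correct).mean()
far = is_correct(models, "s", mispronounced).mean()
print(f"false-rejection rate: {frr:.2f}, false-acceptance rate: {far:.2f}")
```

No mispronounced samples are used during fitting; they appear only at evaluation time, which is the property that lets this formulation avoid the need for annotated error data.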

Original language: English
Pages (from-to): 29-43
Number of pages: 15
Journal: Speech Communication
Publication status: Published - 1 Aug 2019



Keywords

  • Anomaly detection
  • One class SVM
  • Pronunciation verification
  • Speech attributes

ASJC Scopus subject areas

  • Software
  • Modelling and Simulation
  • Communication
  • Language and Linguistics
  • Linguistics and Language
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
