Tabby Talks: An automated tool for the assessment of childhood apraxia of speech

Mostafa Shahin, Beena Ahmed, Avinash Parnandi, Virendra Karappa, Jacqueline McKechnie, Kirrie J. Ballard, Ricardo Gutierrez-Osuna

Research output: Contribution to journal › Article

18 Citations (Scopus)

Abstract

Children with developmental disabilities such as childhood apraxia of speech (CAS) require repeated intervention sessions with a speech therapist, sometimes extending over several years. Technology-based therapy tools offer the potential to reduce the demanding workload of speech therapists as well as the time and cost for families. In response to this need, we have developed "Tabby Talks," a multi-tier system for remote administration of speech therapy. This paper describes the speech processing pipeline that automatically detects common errors associated with CAS. The pipeline contains modules for voice activity detection, pronunciation verification, and lexical stress verification. The voice activity detector evaluates the intensity contour of an utterance and compares it against an adaptive threshold to detect silence segments and measure voicing delays and total production time. The pronunciation verification module uses a generic search lattice structure with multiple internal paths that covers all possible pronunciation errors (substitutions, insertions, and deletions) in the child's production. Finally, the lexical stress verification module classifies the lexical stress across consecutive syllables into strong-weak or weak-strong patterns using a combination of prosodic and spectral measures. These error measures can be provided to the therapist through a web interface, enabling them to adapt the child's therapy program remotely. When evaluated on a dataset of typically developing and disordered speech from children aged 4-16 years, the system achieves a pronunciation verification accuracy of 88.2% at the phoneme level and 80.7% at the utterance level, and a lexical stress classification rate of 83.3%.
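
The voice activity detection step described in the abstract (an intensity contour compared against an adaptive threshold, yielding voicing delay and total production time) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 10 ms non-overlapping frames, and the rule that places the threshold midway between the quietest and loudest frame of the utterance are all assumptions made for illustration.

```python
import math


def frame_intensity_db(samples, frame_len):
    """Return RMS intensity in dB for consecutive non-overlapping frames."""
    contour = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        # Clamp to avoid log10(0) on digital silence.
        contour.append(20.0 * math.log10(max(rms, 1e-10)))
    return contour


def adaptive_threshold(contour, ratio=0.5):
    """Assumed rule: threshold sits between this utterance's floor and peak."""
    floor, peak = min(contour), max(contour)
    return floor + ratio * (peak - floor)


def timing_measures(samples, sample_rate, frame_sec=0.01, ratio=0.5):
    """Estimate voicing delay and production time in seconds.

    Returns None when no frame rises above the threshold.
    """
    frame_len = max(1, round(sample_rate * frame_sec))
    contour = frame_intensity_db(samples, frame_len)
    if not contour:
        return None
    threshold = adaptive_threshold(contour, ratio)
    active = [i for i, level in enumerate(contour) if level > threshold]
    if not active:
        return None
    frame_dur = frame_len / sample_rate
    onset, offset = active[0], active[-1] + 1
    return {
        "voicing_delay": onset * frame_dur,            # silence before speech
        "production_time": (offset - onset) * frame_dur,  # onset to offset
    }
```

Because the threshold is derived from each utterance's own intensity range, the same code adapts to recordings made at different microphone gains, which is the general motivation for an adaptive rather than fixed threshold. A real system would likely smooth the contour and ignore very short bursts; those refinements are omitted here.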

Original language: English
Pages (from-to): 49-64
Number of pages: 16
Journal: Speech Communication
Volume: 70
DOI: https://doi.org/10.1016/j.specom.2015.04.002
Publication status: Published - 1 Jun 2015

Fingerprint

speech therapist
childhood
Therapy
speech therapy
Module
Voice Activity Detection
substitution
workload
therapist
Adaptive Threshold
Speech Processing
Spectral Measure
Lattice Structure
Pipelines
disability
Deletion
Insertion

Keywords

  • Automatic speech recognition
  • Computer aided pronunciation learning
  • Pronunciation verification
  • Prosody
  • Speech therapy

ASJC Scopus subject areas

  • Software
  • Language and Linguistics
  • Modelling and Simulation
  • Communication
  • Linguistics and Language
  • Computer Vision and Pattern Recognition
  • Computer Science Applications

Cite this

Shahin, M., Ahmed, B., Parnandi, A., Karappa, V., McKechnie, J., Ballard, K. J., & Gutierrez-Osuna, R. (2015). Tabby Talks: An automated tool for the assessment of childhood apraxia of speech. Speech Communication, 70, 49-64. https://doi.org/10.1016/j.specom.2015.04.002

@article{a4537479d4804ceb8480d2295b86a900,
title = "Tabby Talks: An automated tool for the assessment of childhood apraxia of speech",
abstract = "Children with developmental disabilities such as childhood apraxia of speech (CAS) require repeated intervention sessions with a speech therapist, sometimes extending over several years. Technology-based therapy tools offer the potential to reduce the demanding workload of speech therapists as well as time and cost for families. In response to this need, we have developed {"}Tabby Talks,{"} a multi-tier system for remote administration of speech therapy. This paper describes the speech processing pipeline to automatically detect common errors associated with CAS. The pipeline contains modules for voice activity detection, pronunciation verification, and lexical stress verification. The voice activity detector evaluates the intensity contour of an utterance and compares it against an adaptive threshold to detect silence segments and measure voicing delays and total production time. The pronunciation verification module uses a generic search lattice structure with multiple internal paths that covers all possible pronunciation errors (substitutions, insertions and deletions) in the child's production. Finally, the lexical stress verification module classifies the lexical stress across consecutive syllables into strong-weak or weak-strong patterns using a combination of prosodic and spectral measures. These error measures can be provided to the therapist through a web interface, to enable them to adapt the child's therapy program remotely. When evaluated on a dataset of typically developing and disordered speech from children ages 4-16 years, the system achieves a pronunciation verification accuracy of 88.2{\%} at the phoneme level and 80.7{\%} at the utterance level, and lexical stress classification rate of 83.3{\%}.",
keywords = "Automatic speech recognition, Computer aided pronunciation learning, Pronunciation verification, Prosody, Speech therapy",
author = "Mostafa Shahin and Beena Ahmed and Avinash Parnandi and Virendra Karappa and Jacqueline McKechnie and Ballard, {Kirrie J.} and Ricardo Gutierrez-Osuna",
year = "2015",
month = "6",
day = "1",
doi = "10.1016/j.specom.2015.04.002",
language = "English",
volume = "70",
pages = "49--64",
journal = "Speech Communication",
issn = "0167-6393",
publisher = "Elsevier",

}

TY - JOUR
T1 - Tabby Talks
T2 - An automated tool for the assessment of childhood apraxia of speech
AU - Shahin, Mostafa
AU - Ahmed, Beena
AU - Parnandi, Avinash
AU - Karappa, Virendra
AU - McKechnie, Jacqueline
AU - Ballard, Kirrie J.
AU - Gutierrez-Osuna, Ricardo
PY - 2015/6/1
Y1 - 2015/6/1

AB - Children with developmental disabilities such as childhood apraxia of speech (CAS) require repeated intervention sessions with a speech therapist, sometimes extending over several years. Technology-based therapy tools offer the potential to reduce the demanding workload of speech therapists as well as time and cost for families. In response to this need, we have developed "Tabby Talks," a multi-tier system for remote administration of speech therapy. This paper describes the speech processing pipeline to automatically detect common errors associated with CAS. The pipeline contains modules for voice activity detection, pronunciation verification, and lexical stress verification. The voice activity detector evaluates the intensity contour of an utterance and compares it against an adaptive threshold to detect silence segments and measure voicing delays and total production time. The pronunciation verification module uses a generic search lattice structure with multiple internal paths that covers all possible pronunciation errors (substitutions, insertions and deletions) in the child's production. Finally, the lexical stress verification module classifies the lexical stress across consecutive syllables into strong-weak or weak-strong patterns using a combination of prosodic and spectral measures. These error measures can be provided to the therapist through a web interface, to enable them to adapt the child's therapy program remotely. When evaluated on a dataset of typically developing and disordered speech from children ages 4-16 years, the system achieves a pronunciation verification accuracy of 88.2% at the phoneme level and 80.7% at the utterance level, and lexical stress classification rate of 83.3%.

KW - Automatic speech recognition
KW - Computer aided pronunciation learning
KW - Pronunciation verification
KW - Prosody
KW - Speech therapy
UR - http://www.scopus.com/inward/record.url?scp=84928481301&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84928481301&partnerID=8YFLogxK
U2 - 10.1016/j.specom.2015.04.002
DO - 10.1016/j.specom.2015.04.002
M3 - Article
AN - SCOPUS:84928481301
VL - 70
SP - 49
EP - 64
JO - Speech Communication
JF - Speech Communication
SN - 0167-6393
ER -