Classification of lexical stress patterns using deep neural network architecture

Mostafa Ali Shahin, Beena Ahmed, Kirrie J. Ballard

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

Lexical stress is a key diagnostic marker of disordered speech as it strongly affects speech perception. In this paper we introduce an automated method to classify between the different lexical stress patterns in children's speech. A deep neural network is used to classify between strong-weak (SW), weak-strong (WS) and equal-stress (SS/WW) patterns in English by measuring the articulation change between the two successive syllables. The deep neural network architecture is trained using a set of acoustic features derived from pitch, duration and intensity measurements along with the energies in different frequency bands. We compared the performance of the deep neural classifier to a traditional single hidden layer MLP. Results show that the deep neural classifier outperforms the traditional MLP. The accuracy of the deep neural system is approximately 85% when classifying between the unequal stress patterns (SW/WS) and greater than 70% when classifying both equal and unequal stress patterns.
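
The comparison described in the abstract, a deep network against a single hidden layer MLP over features from two successive syllables, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the feature values, the encoding of the syllable pair (both syllables plus their difference), the layer widths and depth, the sigmoid activations and the untrained random weights. It shows only the shape of such a classifier, not the authors' system.

```python
import numpy as np

LABELS = ["SW", "WS", "equal"]  # "equal" stands for the SS/WW patterns


def pair_features(syl1, syl2):
    """Encode a syllable pair as both feature vectors plus their difference (assumed encoding)."""
    syl1 = np.asarray(syl1, dtype=float)
    syl2 = np.asarray(syl2, dtype=float)
    return np.concatenate([syl1, syl2, syl1 - syl2])


def init_mlp(layer_sizes, rng):
    """Random weights for a fully connected net; layer_sizes includes input and output sizes."""
    return [(rng.normal(0.0, 0.1, size=(m, n)), np.zeros(n))
            for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]


def forward(params, x):
    """Sigmoid hidden layers followed by a softmax output layer."""
    h = x
    for W, b in params[:-1]:
        h = 1.0 / (1.0 + np.exp(-(h @ W + b)))
    W, b = params[-1]
    logits = h @ W + b
    logits = logits - logits.max()  # subtract the max for numerical stability
    p = np.exp(logits)
    return p / p.sum()


rng = np.random.default_rng(0)

# Hypothetical z-scored features per syllable:
# [pitch, duration, intensity, band energy 1, band energy 2, band energy 3]
syl1 = [0.9, 0.7, 0.8, 0.2, -0.1, 0.3]
syl2 = [-0.4, -0.6, -0.5, 0.1, 0.0, -0.2]
x = pair_features(syl1, syl2)

shallow = init_mlp([x.size, 32, 3], rng)       # single hidden layer MLP baseline
deep = init_mlp([x.size, 32, 32, 32, 3], rng)  # deeper net; depth and widths are arbitrary

probs = forward(deep, x)  # untrained, so the probabilities carry no meaning yet
print(dict(zip(LABELS, probs.round(3))))
```

In a real system both networks would be trained on labelled syllable pairs and compared on held-out data; the sketch stops at the forward pass to avoid inventing a training recipe.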

Original language: English
Title of host publication: 2014 IEEE Workshop on Spoken Language Technology, SLT 2014 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 478-482
Number of pages: 5
ISBN (Electronic): 9781479971299
DOI: https://doi.org/10.1109/SLT.2014.7078621
Publication status: Published - 1 Apr 2014
Event: 2014 IEEE Workshop on Spoken Language Technology, SLT 2014 - South Lake Tahoe, United States
Duration: 7 Dec 2014 to 10 Dec 2014

Other

Other: 2014 IEEE Workshop on Spoken Language Technology, SLT 2014
Country: United States
City: South Lake Tahoe
Period: 7/12/14 to 10/12/14

Fingerprint

Network architecture
Classifiers
Frequency bands
Acoustics
Deep neural networks
Lexical Stress
Stress Patterns
Neural Networks
Classifier

Keywords

  • Automatic assessment
  • Deep neural network
  • Lexical stress
  • Prosody

ASJC Scopus subject areas

  • Computer Science Applications
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
  • Language and Linguistics

Cite this

Shahin, M. A., Ahmed, B., & Ballard, K. J. (2014). Classification of lexical stress patterns using deep neural network architecture. In 2014 IEEE Workshop on Spoken Language Technology, SLT 2014 - Proceedings (pp. 478-482). [7078621] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/SLT.2014.7078621

Classification of lexical stress patterns using deep neural network architecture. / Shahin, Mostafa Ali; Ahmed, Beena; Ballard, Kirrie J.

2014 IEEE Workshop on Spoken Language Technology, SLT 2014 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2014. p. 478-482 7078621.

Shahin, MA, Ahmed, B & Ballard, KJ 2014, Classification of lexical stress patterns using deep neural network architecture. in 2014 IEEE Workshop on Spoken Language Technology, SLT 2014 - Proceedings, 7078621, Institute of Electrical and Electronics Engineers Inc., pp. 478-482, 2014 IEEE Workshop on Spoken Language Technology, SLT 2014, South Lake Tahoe, United States, 7/12/14. https://doi.org/10.1109/SLT.2014.7078621
Shahin MA, Ahmed B, Ballard KJ. Classification of lexical stress patterns using deep neural network architecture. In 2014 IEEE Workshop on Spoken Language Technology, SLT 2014 - Proceedings. Institute of Electrical and Electronics Engineers Inc. 2014. p. 478-482. 7078621 https://doi.org/10.1109/SLT.2014.7078621
Shahin, Mostafa Ali ; Ahmed, Beena ; Ballard, Kirrie J. / Classification of lexical stress patterns using deep neural network architecture. 2014 IEEE Workshop on Spoken Language Technology, SLT 2014 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2014. pp. 478-482
@inproceedings{cb026b97d8374972928c69f9ab6cf3d9,
title = "Classification of lexical stress patterns using deep neural network architecture",
abstract = "Lexical stress is a key diagnostic marker of disordered speech as it strongly affects speech perception. In this paper we introduce an automated method to classify between the different lexical stress patterns in children's speech. A deep neural network is used to classify between strong-weak (SW), weak-strong (WS) and equal-stress (SS/WW) patterns in English by measuring the articulation change between the two successive syllables. The deep neural network architecture is trained using a set of acoustic features derived from pitch, duration and intensity measurements along with the energies in different frequency bands. We compared the performance of the deep neural classifier to a traditional single hidden layer MLP. Results show that the deep neural classifier outperforms the traditional MLP. The accuracy of the deep neural system is approximately 85{\%} when classifying between the unequal stress patterns (SW/WS) and greater than 70{\%} when classifying both equal and unequal stress patterns.",
keywords = "Automatic assessment, Deep neural network, Lexical stress, Prosody",
author = "Shahin, {Mostafa Ali} and Beena Ahmed and Ballard, {Kirrie J.}",
year = "2014",
month = "4",
day = "1",
doi = "10.1109/SLT.2014.7078621",
language = "English",
pages = "478--482",
booktitle = "2014 IEEE Workshop on Spoken Language Technology, SLT 2014 - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY  - GEN
T1  - Classification of lexical stress patterns using deep neural network architecture
AU  - Shahin, Mostafa Ali
AU  - Ahmed, Beena
AU  - Ballard, Kirrie J.
PY  - 2014/4/1
Y1  - 2014/4/1
N2  - Lexical stress is a key diagnostic marker of disordered speech as it strongly affects speech perception. In this paper we introduce an automated method to classify between the different lexical stress patterns in children's speech. A deep neural network is used to classify between strong-weak (SW), weak-strong (WS) and equal-stress (SS/WW) patterns in English by measuring the articulation change between the two successive syllables. The deep neural network architecture is trained using a set of acoustic features derived from pitch, duration and intensity measurements along with the energies in different frequency bands. We compared the performance of the deep neural classifier to a traditional single hidden layer MLP. Results show that the deep neural classifier outperforms the traditional MLP. The accuracy of the deep neural system is approximately 85% when classifying between the unequal stress patterns (SW/WS) and greater than 70% when classifying both equal and unequal stress patterns.
AB  - Lexical stress is a key diagnostic marker of disordered speech as it strongly affects speech perception. In this paper we introduce an automated method to classify between the different lexical stress patterns in children's speech. A deep neural network is used to classify between strong-weak (SW), weak-strong (WS) and equal-stress (SS/WW) patterns in English by measuring the articulation change between the two successive syllables. The deep neural network architecture is trained using a set of acoustic features derived from pitch, duration and intensity measurements along with the energies in different frequency bands. We compared the performance of the deep neural classifier to a traditional single hidden layer MLP. Results show that the deep neural classifier outperforms the traditional MLP. The accuracy of the deep neural system is approximately 85% when classifying between the unequal stress patterns (SW/WS) and greater than 70% when classifying both equal and unequal stress patterns.
KW  - Automatic assessment
KW  - Deep neural network
KW  - Lexical stress
KW  - Prosody
UR  - http://www.scopus.com/inward/record.url?scp=84946690933&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84946690933&partnerID=8YFLogxK
U2  - 10.1109/SLT.2014.7078621
DO  - 10.1109/SLT.2014.7078621
M3  - Conference contribution
SP  - 478
EP  - 482
BT  - 2014 IEEE Workshop on Spoken Language Technology, SLT 2014 - Proceedings
PB  - Institute of Electrical and Electronics Engineers Inc.
ER  -