Automatic classification of unequal lexical stress patterns using machine learning algorithms

Mostafa Ali Shahin, Beena Ahmed, Kirrie J. Ballard

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

10 Citations (Scopus)

Abstract

Technology-based speech therapy systems are severely handicapped by the absence of accurate algorithms for identifying prosodic events. This paper introduces an automatic method for classifying strong-weak (SW) and weak-strong (WS) stress patterns in the speech of children with an American English accent, for use in the assessment of speech dysprosody. We investigate how well two sets of features train classifiers to identify the variation in lexical stress between two consecutive syllables. The first set consists of traditional features derived from measurements of pitch, intensity and duration, whereas the second set consists of the energies of different filter banks. Three classifiers were used in the experiments: an Artificial Neural Network (ANN) with a single hidden layer, a Support Vector Machine (SVM) with both linear and Gaussian kernels, and Maximum Entropy modeling (MaxEnt). The best results were obtained using an ANN classifier and a combination of the two feature sets. The system correctly classified 94% of the SW stress patterns and 76% of the WS stress patterns.
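
The two figures that close the abstract are reported separately for each stress pattern, which is the per-class share of correctly classified items (often called per-class recall). The following is a minimal sketch of that calculation in Python. The function name and the toy labels are hypothetical and chosen only for illustration; they are not data or code from the paper.

```python
from collections import Counter


def per_class_recall(true_labels, predicted_labels):
    """Return {label: share of items with that true label that were predicted correctly}."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have equal length")
    # Number of items per true class.
    totals = Counter(true_labels)
    # Number of correctly predicted items per true class.
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: correct[label] / totals[label] for label in totals}


# Hypothetical toy labels (assumption): four SW words and four WS words.
truth = ["SW", "SW", "SW", "SW", "WS", "WS", "WS", "WS"]
guess = ["SW", "SW", "SW", "WS", "WS", "WS", "SW", "SW"]

recall = per_class_recall(truth, guess)
print(recall)
```

Reporting the two classes separately, rather than as one overall accuracy, makes an imbalance such as the one in the abstract (94% for SW against 76% for WS) visible.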

Original language: English
Title of host publication: 2012 IEEE Workshop on Spoken Language Technology, SLT 2012 - Proceedings
Pages: 388-391
Number of pages: 4
DOIs: https://doi.org/10.1109/SLT.2012.6424255
Publication status: Published - 2012
Event: 2012 IEEE Workshop on Spoken Language Technology, SLT 2012 - Miami, FL, United States
Duration: 2 Dec 2012 to 5 Dec 2012

Other

Other: 2012 IEEE Workshop on Spoken Language Technology, SLT 2012
Country: United States
City: Miami, FL
Period: 2/12/12 to 5/12/12

Fingerprint

neural network
speech therapy
handicapped
entropy
learning
bank
energy
event
experiment
ability
Lexical Stress
Machine Learning
Stress Patterns
Classifier
Artificial Neural Network

Keywords

  • automatic assessment
  • lexical stress
  • prosody

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

Shahin, M. A., Ahmed, B., & Ballard, K. J. (2012). Automatic classification of unequal lexical stress patterns using machine learning algorithms. In 2012 IEEE Workshop on Spoken Language Technology, SLT 2012 - Proceedings (pp. 388-391). [6424255] https://doi.org/10.1109/SLT.2012.6424255

Automatic classification of unequal lexical stress patterns using machine learning algorithms. / Shahin, Mostafa Ali; Ahmed, Beena; Ballard, Kirrie J.

2012 IEEE Workshop on Spoken Language Technology, SLT 2012 - Proceedings. 2012. p. 388-391 6424255.

Shahin, MA, Ahmed, B & Ballard, KJ 2012, Automatic classification of unequal lexical stress patterns using machine learning algorithms. in 2012 IEEE Workshop on Spoken Language Technology, SLT 2012 - Proceedings., 6424255, pp. 388-391, 2012 IEEE Workshop on Spoken Language Technology, SLT 2012, Miami, FL, United States, 2/12/12. https://doi.org/10.1109/SLT.2012.6424255
Shahin MA, Ahmed B, Ballard KJ. Automatic classification of unequal lexical stress patterns using machine learning algorithms. In 2012 IEEE Workshop on Spoken Language Technology, SLT 2012 - Proceedings. 2012. p. 388-391. 6424255 https://doi.org/10.1109/SLT.2012.6424255
Shahin, Mostafa Ali ; Ahmed, Beena ; Ballard, Kirrie J. / Automatic classification of unequal lexical stress patterns using machine learning algorithms. 2012 IEEE Workshop on Spoken Language Technology, SLT 2012 - Proceedings. 2012. pp. 388-391
@inproceedings{3b6cc3ae86874159be9aac2e6052f462,
title = "Automatic classification of unequal lexical stress patterns using machine learning algorithms",
abstract = "Technology-based speech therapy systems are severely handicapped by the absence of accurate algorithms for identifying prosodic events. This paper introduces an automatic method for classifying strong-weak (SW) and weak-strong (WS) stress patterns in the speech of children with an American English accent, for use in the assessment of speech dysprosody. We investigate how well two sets of features train classifiers to identify the variation in lexical stress between two consecutive syllables. The first set consists of traditional features derived from measurements of pitch, intensity and duration, whereas the second set consists of the energies of different filter banks. Three classifiers were used in the experiments: an Artificial Neural Network (ANN) with a single hidden layer, a Support Vector Machine (SVM) with both linear and Gaussian kernels, and Maximum Entropy modeling (MaxEnt). The best results were obtained using an ANN classifier and a combination of the two feature sets. The system correctly classified 94{\%} of the SW stress patterns and 76{\%} of the WS stress patterns.",
keywords = "automatic assessment, lexical stress, prosody",
author = "Shahin, {Mostafa Ali} and Ahmed, Beena and Ballard, {Kirrie J.}",
year = "2012",
doi = "10.1109/SLT.2012.6424255",
language = "English",
isbn = "9781467351263",
pages = "388--391",
booktitle = "2012 IEEE Workshop on Spoken Language Technology, SLT 2012 - Proceedings",

}

TY - GEN

T1 - Automatic classification of unequal lexical stress patterns using machine learning algorithms

AU - Shahin, Mostafa Ali

AU - Ahmed, Beena

AU - Ballard, Kirrie J.

PY - 2012

Y1 - 2012

AB - Technology-based speech therapy systems are severely handicapped by the absence of accurate algorithms for identifying prosodic events. This paper introduces an automatic method for classifying strong-weak (SW) and weak-strong (WS) stress patterns in the speech of children with an American English accent, for use in the assessment of speech dysprosody. We investigate how well two sets of features train classifiers to identify the variation in lexical stress between two consecutive syllables. The first set consists of traditional features derived from measurements of pitch, intensity and duration, whereas the second set consists of the energies of different filter banks. Three classifiers were used in the experiments: an Artificial Neural Network (ANN) with a single hidden layer, a Support Vector Machine (SVM) with both linear and Gaussian kernels, and Maximum Entropy modeling (MaxEnt). The best results were obtained using an ANN classifier and a combination of the two feature sets. The system correctly classified 94% of the SW stress patterns and 76% of the WS stress patterns.

KW - automatic assessment

KW - lexical stress

KW - prosody

UR - http://www.scopus.com/inward/record.url?scp=84874242315&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84874242315&partnerID=8YFLogxK

U2 - 10.1109/SLT.2012.6424255

DO - 10.1109/SLT.2012.6424255

M3 - Conference contribution

SN - 9781467351263

SP - 388

EP - 391

BT - 2012 IEEE Workshop on Spoken Language Technology, SLT 2012 - Proceedings

ER -