Multi-reference WER for evaluating ASR for languages with no orthographic rules

Ahmed Ali, Walid Magdy, Peter Bell, Steve Renals

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

Languages with no standard orthographic representation face a challenge in evaluating output from Automatic Speech Recognition (ASR), since the reference transcription can vary widely from one transcriber to another. We propose an approach for evaluating speech recognition using multiple references. For each recognized speech segment, we ask five different users to transcribe the speech. We combine the alignments for the multiple references and use the combined alignment to report a modified version of Word Error Rate (WER). This approach accepts a recognized word if any of the references typed it in the same form. Results are reported on two varieties of Dialectal Arabic (DA), which has no standard orthography: Egyptian and North African speech. The average WER against the five individual references is 71.4% and 80.1%, respectively. When all references are combined, the multi-reference WER (MR-WER) falls to 39.7% and 45.9%, respectively.
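The scoring idea in the abstract can be sketched in a few lines of Python. This is a simplified illustration, not the authors' exact combined-alignment algorithm: here each hypothesis word is forgiven if any reference's word-level alignment matches it, and the error rate is then reported against the best-scoring reference. All function names are hypothetical.

```python
from typing import List, Optional, Tuple

def levenshtein_ops(ref: List[str], hyp: List[str]) -> List[Tuple[str, Optional[int], Optional[int]]]:
    """Word-level edit-distance alignment; returns (op, ref_index, hyp_index) tuples."""
    n, m = len(ref), len(hyp)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # match / substitution
    # Backtrace from the bottom-right corner to recover one optimal alignment.
    ops, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + (0 if ref[i - 1] == hyp[j - 1] else 1):
            ops.append(("match" if ref[i - 1] == hyp[j - 1] else "sub", i - 1, j - 1))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(("del", i - 1, None))
            i -= 1
        else:
            ops.append(("ins", None, j - 1))
            j -= 1
    return list(reversed(ops))

def mr_wer(references: List[List[str]], hyp: List[str]) -> float:
    """Simplified multi-reference WER: a hypothesis word is accepted if ANY
    reference's alignment matches it in the same form; the rate is then
    taken against the reference the hypothesis scores best on."""
    # Union of hypothesis positions matched by at least one reference.
    accepted = set()
    for ref in references:
        for op, _, hi in levenshtein_ops(ref, hyp):
            if op == "match":
                accepted.add(hi)
    rates = []
    for ref in references:
        errors = 0
        for op, _, hi in levenshtein_ops(ref, hyp):
            if op == "del":
                errors += 1                     # reference word missing entirely
            elif op in ("sub", "ins") and hi not in accepted:
                errors += 1                     # hyp word matched by no reference
        rates.append(errors / max(len(ref), 1))
    return min(rates)

# Spelling variants across references are forgiven, as in the paper's idea:
refs = [["the", "cat", "sat"], ["a", "cat", "sat"]]
print(mr_wer(refs, ["a", "cat", "sat"]))   # exact match with one reference -> 0.0
```

With orthographically unstable languages, each transcriber's spelling acts as an alternative surface form, so the union over references is what drives the large WER-to-MR-WER gap reported above.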

Original language: English
Title of host publication: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 576-580
Number of pages: 5
ISBN (Print): 9781479972913
DOIs: 10.1109/ASRU.2015.7404847
Publication status: Published - 10 Feb 2016
Event: IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Scottsdale, United States
Duration: 13 Dec 2015 - 17 Dec 2015

Other

Other: IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015
Country: United States
City: Scottsdale
Period: 13/12/15 - 17/12/15

Fingerprint

  • Speech recognition
  • Transcription

Keywords

  • Under-Resource
  • WER

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition

Cite this

Ali, A., Magdy, W., Bell, P., & Renals, S. (2016). Multi-reference WER for evaluating ASR for languages with no orthographic rules. In 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings (pp. 576-580). [7404847] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ASRU.2015.7404847

Multi-reference WER for evaluating ASR for languages with no orthographic rules. / Ali, Ahmed; Magdy, Walid; Bell, Peter; Renals, Steve.

2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2016. p. 576-580 (article 7404847).

Ali, A, Magdy, W, Bell, P & Renals, S 2016, Multi-reference WER for evaluating ASR for languages with no orthographic rules. in 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings., 7404847, Institute of Electrical and Electronics Engineers Inc., pp. 576-580, IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015, Scottsdale, United States, 13/12/15. https://doi.org/10.1109/ASRU.2015.7404847
Ali A, Magdy W, Bell P, Renals S. Multi-reference WER for evaluating ASR for languages with no orthographic rules. In 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings. Institute of Electrical and Electronics Engineers Inc. 2016. p. 576-580. 7404847 https://doi.org/10.1109/ASRU.2015.7404847
Ali, Ahmed ; Magdy, Walid ; Bell, Peter ; Renals, Steve. / Multi-reference WER for evaluating ASR for languages with no orthographic rules. 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2016. pp. 576-580
@inproceedings{fb65a5b3979d45dca6e4d728e7b09fcb,
title = "Multi-reference WER for evaluating ASR for languages with no orthographic rules",
abstract = "Languages with no standard orthographic representation face a challenge in evaluating output from Automatic Speech Recognition (ASR), since the reference transcription can vary widely from one transcriber to another. We propose an approach for evaluating speech recognition using multiple references. For each recognized speech segment, we ask five different users to transcribe the speech. We combine the alignments for the multiple references and use the combined alignment to report a modified version of Word Error Rate (WER). This approach accepts a recognized word if any of the references typed it in the same form. Results are reported on two varieties of Dialectal Arabic (DA), which has no standard orthography: Egyptian and North African speech. The average WER against the five individual references is 71.4{\%} and 80.1{\%}, respectively. When all references are combined, the multi-reference WER (MR-WER) falls to 39.7{\%} and 45.9{\%}, respectively.",
keywords = "Under-Resource, WER",
author = "Ahmed Ali and Walid Magdy and Peter Bell and Steve Renals",
year = "2016",
month = "2",
day = "10",
doi = "10.1109/ASRU.2015.7404847",
language = "English",
isbn = "9781479972913",
pages = "576--580",
booktitle = "2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - GEN

T1 - Multi-reference WER for evaluating ASR for languages with no orthographic rules

AU - Ali, Ahmed

AU - Magdy, Walid

AU - Bell, Peter

AU - Renals, Steve

PY - 2016/2/10

Y1 - 2016/2/10

N2 - Languages with no standard orthographic representation face a challenge in evaluating output from Automatic Speech Recognition (ASR), since the reference transcription can vary widely from one transcriber to another. We propose an approach for evaluating speech recognition using multiple references. For each recognized speech segment, we ask five different users to transcribe the speech. We combine the alignments for the multiple references and use the combined alignment to report a modified version of Word Error Rate (WER). This approach accepts a recognized word if any of the references typed it in the same form. Results are reported on two varieties of Dialectal Arabic (DA), which has no standard orthography: Egyptian and North African speech. The average WER against the five individual references is 71.4% and 80.1%, respectively. When all references are combined, the multi-reference WER (MR-WER) falls to 39.7% and 45.9%, respectively.

AB - Languages with no standard orthographic representation face a challenge in evaluating output from Automatic Speech Recognition (ASR), since the reference transcription can vary widely from one transcriber to another. We propose an approach for evaluating speech recognition using multiple references. For each recognized speech segment, we ask five different users to transcribe the speech. We combine the alignments for the multiple references and use the combined alignment to report a modified version of Word Error Rate (WER). This approach accepts a recognized word if any of the references typed it in the same form. Results are reported on two varieties of Dialectal Arabic (DA), which has no standard orthography: Egyptian and North African speech. The average WER against the five individual references is 71.4% and 80.1%, respectively. When all references are combined, the multi-reference WER (MR-WER) falls to 39.7% and 45.9%, respectively.

KW - Under-Resource

KW - WER

UR - http://www.scopus.com/inward/record.url?scp=84964440067&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84964440067&partnerID=8YFLogxK

U2 - 10.1109/ASRU.2015.7404847

DO - 10.1109/ASRU.2015.7404847

M3 - Conference contribution

SN - 9781479972913

SP - 576

EP - 580

BT - 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

ER -