A complete KALDI recipe for building Arabic speech recognition systems

Ahmed Ali, Yifan Zhang, Patrick Cardinal, Najim Dehak, Stephan Vogel, James Glass

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

33 Citations (Scopus)

Abstract

In this paper we present a recipe and language resources for training and testing Arabic speech recognition systems using the KALDI toolkit. We built a prototype broadcast news system using 200 hours of GALE data that is publicly available through LDC. We describe in detail the decisions made in building the system: using the MADA toolkit for text normalization and vowelization; why we use 36 phonemes; how we generate pronunciations; and how we build the language model. We report results using state-of-the-art modeling and decoding techniques. The scripts are released through KALDI, and the resources are made available on QCRI's language resources web portal. This is the first effort to share reproducible, sizable training and testing results on a Modern Standard Arabic (MSA) system.

Original language: English
Title of host publication: 2014 IEEE Workshop on Spoken Language Technology, SLT 2014 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 525-529
Number of pages: 5
ISBN (Print): 9781479971299
DOI: https://doi.org/10.1109/SLT.2014.7078629
Publication status: Published - 1 Apr 2015
Event: 2014 IEEE Workshop on Spoken Language Technology, SLT 2014 - South Lake Tahoe, United States
Duration: 7 Dec 2014 to 10 Dec 2014

Keywords

  • Arabic
  • ASR system
  • GALE
  • KALDI
  • Lexicon

ASJC Scopus subject areas

  • Computer Science Applications
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
  • Language and Linguistics

Cite this

Ali, A., Zhang, Y., Cardinal, P., Dehak, N., Vogel, S., & Glass, J. (2015). A complete KALDI recipe for building Arabic speech recognition systems. In 2014 IEEE Workshop on Spoken Language Technology, SLT 2014 - Proceedings (pp. 525-529). [7078629] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/SLT.2014.7078629

@inproceedings{76170053a0ec4ad5a7622a77777e9006,
title = "A complete KALDI recipe for building Arabic speech recognition systems",
abstract = "In this paper we present a recipe and language resources for training and testing Arabic speech recognition systems using the KALDI toolkit. We built a prototype broadcast news system using 200 hours GALE data that is publicly available through LDC. We describe in detail the decisions made in building the system: using the MADA toolkit for text normalization and vowelization; why we use 36 phonemes; how we generate pronunciations; how we build the language model. We report results using state-of-the-art modeling and decoding techniques. The scripts are released through KALDI and resources are made available on QCRI's language resources web portal. This is the first effort to share reproducible sizable training and testing results on MSA system.",
keywords = "Arabic, ASR system, GALE, KALDI, Lexicon",
author = "Ahmed Ali and Yifan Zhang and Patrick Cardinal and Najim Dehak and Stephan Vogel and James Glass",
year = "2015",
month = "4",
day = "1",
doi = "10.1109/SLT.2014.7078629",
language = "English",
isbn = "9781479971299",
pages = "525--529",
booktitle = "2014 IEEE Workshop on Spoken Language Technology, SLT 2014 - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY  - GEN
T1  - A complete KALDI recipe for building Arabic speech recognition systems
AU  - Ali, Ahmed
AU  - Zhang, Yifan
AU  - Cardinal, Patrick
AU  - Dehak, Najim
AU  - Vogel, Stephan
AU  - Glass, James
PY  - 2015/4/1
Y1  - 2015/4/1
N2  - In this paper we present a recipe and language resources for training and testing Arabic speech recognition systems using the KALDI toolkit. We built a prototype broadcast news system using 200 hours GALE data that is publicly available through LDC. We describe in detail the decisions made in building the system: using the MADA toolkit for text normalization and vowelization; why we use 36 phonemes; how we generate pronunciations; how we build the language model. We report results using state-of-the-art modeling and decoding techniques. The scripts are released through KALDI and resources are made available on QCRI's language resources web portal. This is the first effort to share reproducible sizable training and testing results on MSA system.
AB  - In this paper we present a recipe and language resources for training and testing Arabic speech recognition systems using the KALDI toolkit. We built a prototype broadcast news system using 200 hours GALE data that is publicly available through LDC. We describe in detail the decisions made in building the system: using the MADA toolkit for text normalization and vowelization; why we use 36 phonemes; how we generate pronunciations; how we build the language model. We report results using state-of-the-art modeling and decoding techniques. The scripts are released through KALDI and resources are made available on QCRI's language resources web portal. This is the first effort to share reproducible sizable training and testing results on MSA system.
KW  - Arabic
KW  - ASR system
KW  - GALE
KW  - KALDI
KW  - Lexicon
UR  - http://www.scopus.com/inward/record.url?scp=84928153597&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84928153597&partnerID=8YFLogxK
U2  - 10.1109/SLT.2014.7078629
DO  - 10.1109/SLT.2014.7078629
M3  - Conference contribution
SN  - 9781479971299
SP  - 525
EP  - 529
BT  - 2014 IEEE Workshop on Spoken Language Technology, SLT 2014 - Proceedings
PB  - Institute of Electrical and Electronics Engineers Inc.
ER  -