Recent advances in ASR Applied to an Arabic transcription system for Al-Jazeera

Patrick Cardinal, Ahmed Ali, Najim Dehak, Yu Zhang, Tuka Al Hanai, Yifan Zhang, James Glass, Stephan Vogel

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

10 Citations (Scopus)

Abstract

This paper describes a detailed comparison of several state-of-the-art speech recognition techniques applied to a limited Arabic broadcast news dataset. All approaches were trained on 50 hours of transcribed audio from the Al-Jazeera news channel. The best results were obtained using i-vector-based speaker adaptation in a training scenario that used the Minimum Phone Error (MPE) criterion combined with sequential Deep Neural Network (DNN) training. We report results for two types of test data: broadcast news reports, with a best word error rate (WER) of 17.86%, and broadcast conversations, with a best WER of 29.85%. The overall WER on this test set is 25.6%.
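
For readers unfamiliar with the metric, the word error rate quoted above is conventionally computed as the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the paper's scoring pipeline. The function name `word_error_rate`, the whitespace tokenization, and the toy English sentences are assumptions made for illustration; Arabic scoring typically involves text normalization that is not shown here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Hypothetical helper for illustration; words are split on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# Toy example (made-up sentences): one substitution in three reference words.
toy_wer = word_error_rate("the cat sat", "the dog sat")
```

Because insertions count as errors while the denominator is the reference length, a WER computed this way can exceed 100% for very poor hypotheses.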

Original language: English
Title of host publication: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publisher: International Speech and Communication Association
Pages: 2088-2092
Number of pages: 5
Publication status: Published - 2014
Event: 15th Annual Conference of the International Speech Communication Association: Celebrating the Diversity of Spoken Languages, INTERSPEECH 2014 - Singapore, Singapore
Duration: 14 Sep 2014 to 18 Sep 2014

Other

Other: 15th Annual Conference of the International Speech Communication Association: Celebrating the Diversity of Spoken Languages, INTERSPEECH 2014
Country: Singapore
City: Singapore
Period: 14/9/14 to 18/9/14

Keywords

  • Arabic
  • ASR system
  • Kaldi

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation

Cite this

Cardinal, P., Ali, A., Dehak, N., Zhang, Y., Al Hanai, T., Zhang, Y., Glass, J., & Vogel, S. (2014). Recent advances in ASR Applied to an Arabic transcription system for Al-Jazeera. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 2088-2092). International Speech and Communication Association.