MIT-QCRI Arabic dialect identification system for the 2017 multi-genre broadcast challenge

Suwon Shon, Ahmed Ali, James Glass

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

To successfully annotate the Arabic speech content found in open-domain media broadcasts, it is essential to be able to process a diverse set of Arabic dialects. The 2017 Multi-Genre Broadcast challenge (MGB-3) offered two tasks: Arabic speech recognition and Arabic Dialect Identification (ADI). In this paper, we describe our efforts to create an ADI system for the MGB-3 challenge, with the goal of distinguishing among four major Arabic dialects as well as Modern Standard Arabic. Our research focused on dialect variability and on domain mismatch between the training and test domains. To achieve a robust ADI system, we explored both Siamese neural network models, which learn similarities and dissimilarities among Arabic dialects, and i-vector post-processing to adapt to domain mismatch. Both acoustic and linguistic features were used for the final MGB-3 submissions, with the best primary system achieving 75% accuracy on the official 10-hour test set.

Original language: English
Title of host publication: 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 374-380
Number of pages: 7
Volume: 2018-January
ISBN (Electronic): 9781509047888
DOI: https://doi.org/10.1109/ASRU.2017.8268960
Publication status: Published - 24 Jan 2018
Event: 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017 - Okinawa, Japan
Duration: 16 Dec 2017 to 20 Dec 2017

Other

Other: 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017
Country: Japan
City: Okinawa
Period: 16/12/17 to 20/12/17

Fingerprint

Identification (control systems)
Speech recognition
Linguistics
Acoustics
Neural networks
Processing

Keywords

  • Arabic
  • Dialect Recognition
  • Domain Adaptation
  • MGB challenge
  • Siamese Network

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Human-Computer Interaction

Cite this

Shon, S., Ali, A., & Glass, J. (2018). MIT-QCRI Arabic dialect identification system for the 2017 multi-genre broadcast challenge. In 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017 - Proceedings (Vol. 2018-January, pp. 374-380). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ASRU.2017.8268960

@inproceedings{26aa7bf4f8a7405cb64895f7e6851b25,
title = "MIT-QCRI Arabic dialect identification system for the 2017 multi-genre broadcast challenge",
keywords = "Arabic, Dialect Recognition, Domain Adaptation, MGB challenge, Siamese Network",
author = "Suwon Shon and Ahmed Ali and James Glass",
year = "2018",
month = "1",
day = "24",
doi = "10.1109/ASRU.2017.8268960",
language = "English",
volume = "2018-January",
pages = "374--380",
booktitle = "2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017 - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - GEN

T1 - MIT-QCRI Arabic dialect identification system for the 2017 multi-genre broadcast challenge

AU - Shon, Suwon

AU - Ali, Ahmed

AU - Glass, James

PY - 2018/1/24

Y1 - 2018/1/24

KW - Arabic

KW - Dialect Recognition

KW - Domain Adaptation

KW - MGB challenge

KW - Siamese Network

UR - http://www.scopus.com/inward/record.url?scp=85050559305&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85050559305&partnerID=8YFLogxK

U2 - 10.1109/ASRU.2017.8268960

DO - 10.1109/ASRU.2017.8268960

M3 - Conference contribution

AN - SCOPUS:85050559305

VL - 2018-January

SP - 374

EP - 380

BT - 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

ER -