FarSpeech: Arabic natural language processing for live Arabic speech

Mohamed Eldesouki, Naassih Gopee, Ahmed Ali, Kareem Darwish

Research output: Contribution to journal (Conference article)

Abstract

This paper presents FarSpeech, QCRI's combined Arabic speech recognition, natural language processing (NLP), and dialect identification pipeline. It uses modern web technologies to capture live audio, transcribes the Arabic audio, applies NLP to the transcripts, and identifies the dialect of the speaker. For transcription, we use QATS, a Kaldi-based ASR system that uses Time Delay Neural Networks (TDNN). For NLP, we use a state-of-the-art Arabic NLP toolkit that employs various deep neural network and SVM-based models. Finally, our dialect identification system is multimodal, drawing on both acoustic and linguistic input. FarSpeech presents different screens to display the transcripts, text segmentation, part-of-speech tags, recognized named entities, diacritized text, and the identified dialect of the speech.

Original language: English
Pages (from-to): 2372-2373
Number of pages: 2
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2019-September
DOIs: https://doi.org/10.21437/Interspeech.2019-8030
Publication status: Published - 1 Jan 2019
Event: 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019 - Graz, Austria
Duration: 15 Sep 2019 to 19 Sep 2019

Fingerprint

Natural Language
Processing
Neural Networks
Multimodality
Transcription
Speech Recognition
System Identification
Linguistics
Time Delay
Identification (control systems)
Acoustics
Segmentation
Pipelines
Speech
Natural Language Processing
Text
Model

Keywords

  • Live speech recognition
  • Natural Language Processing
  • Speech Transcription

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modelling and Simulation

Cite this

FarSpeech: Arabic natural language processing for live Arabic speech. / Eldesouki, Mohamed; Gopee, Naassih; Ali, Ahmed; Darwish, Kareem.

In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, Vol. 2019-September, 01.01.2019, p. 2372-2373.


@article{fb1294a85b534680ae286d779bb6dd4f,
title = "FarSpeech: Arabic natural language processing for live Arabic speech",
abstract = "This paper presents FarSpeech, QCRI's combined Arabic speech recognition, natural language processing (NLP), and dialect identification pipeline. It uses modern web technologies to capture live audio, transcribes the Arabic audio, applies NLP to the transcripts, and identifies the dialect of the speaker. For transcription, we use QATS, a Kaldi-based ASR system that uses Time Delay Neural Networks (TDNN). For NLP, we use a state-of-the-art Arabic NLP toolkit that employs various deep neural network and SVM-based models. Finally, our dialect identification system is multimodal, drawing on both acoustic and linguistic input. FarSpeech presents different screens to display the transcripts, text segmentation, part-of-speech tags, recognized named entities, diacritized text, and the identified dialect of the speech.",
keywords = "Live speech recognition, Natural Language Processing, Speech Transcription",
author = "Mohamed Eldesouki and Naassih Gopee and Ahmed Ali and Kareem Darwish",
year = "2019",
month = "1",
day = "1",
doi = "10.21437/Interspeech.2019-8030",
language = "English",
volume = "2019-September",
pages = "2372--2373",
journal = "Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",
issn = "2308-457X",

}

TY - JOUR

T1 - FarSpeech

T2 - Arabic natural language processing for live Arabic speech

AU - Eldesouki, Mohamed

AU - Gopee, Naassih

AU - Ali, Ahmed

AU - Darwish, Kareem

PY - 2019/1/1

Y1 - 2019/1/1

AB - This paper presents FarSpeech, QCRI's combined Arabic speech recognition, natural language processing (NLP), and dialect identification pipeline. It uses modern web technologies to capture live audio, transcribes the Arabic audio, applies NLP to the transcripts, and identifies the dialect of the speaker. For transcription, we use QATS, a Kaldi-based ASR system that uses Time Delay Neural Networks (TDNN). For NLP, we use a state-of-the-art Arabic NLP toolkit that employs various deep neural network and SVM-based models. Finally, our dialect identification system is multimodal, drawing on both acoustic and linguistic input. FarSpeech presents different screens to display the transcripts, text segmentation, part-of-speech tags, recognized named entities, diacritized text, and the identified dialect of the speech.

KW - Live speech recognition

KW - Natural Language Processing

KW - Speech Transcription

UR - http://www.scopus.com/inward/record.url?scp=85074723974&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85074723974&partnerID=8YFLogxK

U2 - 10.21437/Interspeech.2019-8030

DO - 10.21437/Interspeech.2019-8030

M3 - Conference article

AN - SCOPUS:85074723974

VL - 2019-September

SP - 2372

EP - 2373

JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

SN - 2308-457X

ER -