Improving speech synthesis of machine translation output

Alok Parlikar, Alan W. Black, Stephan Vogel

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Citations (Scopus)

Abstract

Speech synthesizers are optimized for fluent natural text. However, in a speech to speech translation system, they have to process machine translation output, which is often not fluent. Rendering machine translations as speech makes them even harder to understand than the synthesis of natural text. A speech synthesizer must deal with the disfluencies in translations in order to be comprehensible and communicate the content. In this paper, we explore three synthesis strategies that address different problems found in translation output. By carrying out listening tasks and measuring transcription accuracies, we find that these methods can make the synthesis of translations more intelligible.

Original language: English
Title of host publication: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
Pages: 194-197
Number of pages: 4
Publication status: Published - 1 Dec 2010
Externally published: Yes
Event: 11th Annual Conference of the International Speech Communication Association: Spoken Language Processing for All, INTERSPEECH 2010 - Makuhari, Chiba, Japan
Duration: 26 Sep 2010 - 30 Sep 2010

Fingerprint

Communication Aids for Disabled
Speech Synthesis
Machine Translation
Synthesizer

Keywords

  • Machine translation disfluencies
  • Speech synthesis
  • Spoken language translation

ASJC Scopus subject areas

  • Language and Linguistics
  • Speech and Hearing

Cite this

Parlikar, A., Black, A. W., & Vogel, S. (2010). Improving speech synthesis of machine translation output. In Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010 (pp. 194-197)

@inproceedings{3120845945b44cf8994196c8b3d30eb7,
title = "Improving speech synthesis of machine translation output",
abstract = "Speech synthesizers are optimized for fluent natural text. However, in a speech to speech translation system, they have to process machine translation output, which is often not fluent. Rendering machine translations as speech makes them even harder to understand than the synthesis of natural text. A speech synthesizer must deal with the disfluencies in translations in order to be comprehensible and communicate the content. In this paper, we explore three synthesis strategies that address different problems found in translation output. By carrying out listening tasks and measuring transcription accuracies, we find that these methods can make the synthesis of translations more intelligible.",
keywords = "Machine translation disfluencies, Speech synthesis, Spoken language translation",
author = "Alok Parlikar and Black, {Alan W.} and Stephan Vogel",
year = "2010",
month = "12",
day = "1",
language = "English",
pages = "194--197",
booktitle = "Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010",
}

TY - GEN
T1 - Improving speech synthesis of machine translation output
AU - Parlikar, Alok
AU - Black, Alan W.
AU - Vogel, Stephan
PY - 2010/12/1
Y1 - 2010/12/1
N2 - Speech synthesizers are optimized for fluent natural text. However, in a speech to speech translation system, they have to process machine translation output, which is often not fluent. Rendering machine translations as speech makes them even harder to understand than the synthesis of natural text. A speech synthesizer must deal with the disfluencies in translations in order to be comprehensible and communicate the content. In this paper, we explore three synthesis strategies that address different problems found in translation output. By carrying out listening tasks and measuring transcription accuracies, we find that these methods can make the synthesis of translations more intelligible.
KW - Machine translation disfluencies
KW - Speech synthesis
KW - Spoken language translation
UR - http://www.scopus.com/inward/record.url?scp=79959824887&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79959824887&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:79959824887
SP - 194
EP - 197
BT - Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
ER -