Pronunciation extraction from phoneme sequences through cross-lingual word-to-phoneme alignment

Felix Stahlberg, Tim Schlippe, Stephan Vogel, Tanja Schultz

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

Using written translations in a source language, we cross-lingually segment phoneme sequences in a target language into word units with our new alignment model, Model 3P [17]. From this segmentation, we deduce phonetic transcriptions of target-language words, introduce the vocabulary in terms of word IDs, and extract a pronunciation dictionary. Our approach is highly relevant for bootstrapping dictionaries from audio data for Automatic Speech Recognition and for bypassing the written form in Speech-to-Speech Translation, particularly for under-resourced languages and languages that are not written at all. An analysis of 14 translations in 9 languages, used to build a dictionary for English, shows that the quality of the resulting dictionary is better when the vocabulary sizes of the source and target languages are close, sentences are shorter, words are repeated more often, and translations follow formal equivalence.
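The dictionary-extraction step described in the abstract can be illustrated with a minimal sketch. It is not Model 3P: it assumes the word-to-phoneme alignment is already given as one source-word index per target phoneme, and the function names, the "W<n>" word ID format, and the choice of the most frequent segment as the pronunciation are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict
from itertools import groupby


def segment_by_alignment(phonemes, aligned_source_idx):
    """Split one target-language phoneme sequence into word units.

    phonemes: list of phoneme symbols for one utterance.
    aligned_source_idx: for each phoneme, the index of the source word it is
        aligned to (assumed input; a real aligner would have to produce it).
    Consecutive phonemes aligned to the same source word form one unit.
    """
    if len(phonemes) != len(aligned_source_idx):
        raise ValueError("need exactly one alignment index per phoneme")
    segments = []
    for src_idx, group in groupby(zip(aligned_source_idx, phonemes),
                                  key=lambda pair: pair[0]):
        segments.append((src_idx, tuple(ph for _, ph in group)))
    return segments


def extract_dictionary(corpus):
    """Build a pronunciation dictionary keyed by hypothetical word IDs.

    corpus: iterable of (source_words, phonemes, aligned_source_idx) triples.
    Segments are pooled by the source word they align to, and the most
    frequent segment is kept as that entry's pronunciation (an assumption
    of this sketch, not a claim about the paper's procedure).
    """
    counts = defaultdict(Counter)
    for source_words, phonemes, alignment in corpus:
        for src_idx, segment in segment_by_alignment(phonemes, alignment):
            counts[source_words[src_idx]][segment] += 1

    dictionary = {}
    for n, source_word in enumerate(sorted(counts)):
        best_segment, _ = counts[source_word].most_common(1)[0]
        dictionary[f"W{n}"] = {
            "source_word": source_word,
            "phonemes": " ".join(best_segment),
        }
    return dictionary


# Toy example: German source words aligned to English phonemes.
corpus = [
    (["das", "haus"], ["DH", "AH", "HH", "AW", "S"], [0, 0, 1, 1, 1]),
    (["ein", "haus"], ["AH", "HH", "AW", "S"], [0, 1, 1, 1]),
]
for word_id, entry in extract_dictionary(corpus).items():
    print(word_id, entry["source_word"], entry["phonemes"])
```

Pooling segments by aligned source word makes repeated words reinforce one pronunciation, which is consistent with the abstract's observation that more word repetitions improve dictionary quality.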

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 260-272
Number of pages: 13
Volume: 7978 LNAI
DOI: https://doi.org/10.1007/978-3-642-39593-2_23
Publication status: Published - 3 Sep 2013
Event: 1st International Conference on Statistical Language and Speech Processing, SLSP 2013 - Tarragona, Spain
Duration: 29 Jul 2013 to 31 Jul 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7978 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 1st International Conference on Statistical Language and Speech Processing, SLSP 2013
Country: Spain
City: Tarragona
Period: 29/7/13 to 31/7/13

Keywords

  • pronunciation dictionary
  • speech-to-speech translation
  • under-resourced languages
  • word segmentation

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Stahlberg, F., Schlippe, T., Vogel, S., & Schultz, T. (2013). Pronunciation extraction from phoneme sequences through cross-lingual word-to-phoneme alignment. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7978 LNAI, pp. 260-272). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7978 LNAI). https://doi.org/10.1007/978-3-642-39593-2_23