Word segmentation through cross-lingual word-to-phoneme alignment

Felix Stahlberg, Tim Schlippe, Stephan Vogel, Tanja Schultz

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

14 Citations (Scopus)

Abstract

We present Model 3P, a new alignment model for cross-lingual word-to-phoneme alignment, and show that unsupervised learning of word segmentation is more accurate when information from another language is used. Word segmentation with cross-lingual information is highly relevant for bootstrapping pronunciation dictionaries from audio data for Automatic Speech Recognition, bypassing the written form in Speech-to-Speech Translation, or building the vocabulary of an unseen language, particularly in the context of under-resourced languages. On the BTEC corpus [2], using Model 3P to align English words with Spanish phonemes outperforms a state-of-the-art monolingual word segmentation approach [1] by up to 42% absolute in F-Score on the phoneme level, and outperforms a GIZA++ alignment based on IBM Model 3 by up to 17%.
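The idea of reading a segmentation off a word-to-phoneme alignment can be illustrated with a small sketch. This is not Model 3P and not the authors' evaluation code: the function names, the toy phoneme sequence, and the hand-written alignments are all assumptions made for illustration, and the boundary-based F-score below is only one common way to score a segmentation (the abstract does not define the exact metric).

```python
from itertools import groupby


def segment_by_alignment(phonemes, aligned_word):
    """Group consecutive phonemes that are aligned to the same source word.

    aligned_word[i] is the (hypothetical) index of the source-language word
    that phoneme i is aligned to. A word that appears in two separate runs
    produces two separate segments.
    """
    if len(phonemes) != len(aligned_word):
        raise ValueError("one alignment entry per phoneme is required")
    segments, pos = [], 0
    for _, run in groupby(aligned_word):
        n = len(list(run))
        segments.append(phonemes[pos:pos + n])
        pos += n
    return segments


def boundaries(segments):
    """Return the internal boundary positions as phoneme offsets."""
    cuts, pos = set(), 0
    for seg in segments[:-1]:
        pos += len(seg)
        cuts.add(pos)
    return cuts


def boundary_f_score(hyp_segments, ref_segments):
    """Harmonic mean of boundary precision and recall."""
    hyp, ref = boundaries(hyp_segments), boundaries(ref_segments)
    if not hyp and not ref:
        return 1.0  # both agree that there is no internal boundary
    true_pos = len(hyp & ref)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(hyp)
    recall = true_pos / len(ref)
    return 2 * precision * recall / (precision + recall)


# Toy example: Spanish phonemes for "buenos dias" (simplified notation).
phonemes = ["b", "w", "e", "n", "o", "s", "d", "i", "a", "s"]

# Assumed reference: two words, boundary after the sixth phoneme.
ref_segments = segment_by_alignment(phonemes, [0] * 6 + [1] * 4)

# Assumed hypothesis from an aligner that used three source words.
hyp_segments = segment_by_alignment(phonemes, [0, 0, 0, 1, 1, 1, 2, 2, 2, 2])

score = boundary_f_score(hyp_segments, ref_segments)
print(hyp_segments, round(score, 3))
```

In this toy case the hypothesis finds the true boundary but also inserts a spurious one, so recall is perfect while precision is halved.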

Original language: English
Title of host publication: 2012 IEEE Workshop on Spoken Language Technology, SLT 2012 - Proceedings
Pages: 85-90
Number of pages: 6
DOI: https://doi.org/10.1109/SLT.2012.6424202
Publication status: Published - 1 Dec 2012
Event: 2012 IEEE Workshop on Spoken Language Technology, SLT 2012 - Miami, FL, United States
Duration: 2 Dec 2012 to 5 Dec 2012

Fingerprint

language
dictionary
vocabulary
segmentation
Alignment
Word Segmentation
Phoneme
learning
Bootstrap
Speech-to-speech Translation
English Words
Automatic Speech Recognition
English-Spanish
Unsupervised Learning

Keywords

  • alignment model
  • speech-to-speech translation
  • under-resourced language
  • word segmentation

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

Stahlberg, F., Schlippe, T., Vogel, S., & Schultz, T. (2012). Word segmentation through cross-lingual word-to-phoneme alignment. In 2012 IEEE Workshop on Spoken Language Technology, SLT 2012 - Proceedings (pp. 85-90). [6424202] https://doi.org/10.1109/SLT.2012.6424202

@inproceedings{3e53fed9ef3945ee9d0501a4862b9d2f,
title = "Word segmentation through cross-lingual word-to-phoneme alignment",
abstract = "We present our new alignment model Model 3P for cross-lingual word-to-phoneme alignment, and show that unsupervised learning of word segmentation is more accurate when information of another language is used. Word segmentation with cross-lingual information is highly relevant to bootstrap pronunciation dictionaries from audio data for Automatic Speech Recognition, bypass the written form in Speech-to-Speech Translation or build the vocabulary of an unseen language, particularly in the context of under-resourced languages. Using Model 3P for the alignment between English words and Spanish phonemes outperforms a state-of-the-art monolingual word segmentation approach [1] on the BTEC corpus [2] by up to 42{\%} absolute in F-Score on the phoneme level and a GIZA++ alignment based on IBM Model 3 by up to 17{\%}.",
keywords = "alignment model, speech-to-speech translation, under-resourced language, word segmentation",
author = "Felix Stahlberg and Tim Schlippe and Stephan Vogel and Tanja Schultz",
year = "2012",
month = "12",
day = "1",
doi = "10.1109/SLT.2012.6424202",
language = "English",
isbn = "9781467351263",
pages = "85--90",
booktitle = "2012 IEEE Workshop on Spoken Language Technology, SLT 2012 - Proceedings",

}

TY  - GEN
T1  - Word segmentation through cross-lingual word-to-phoneme alignment
AU  - Stahlberg, Felix
AU  - Schlippe, Tim
AU  - Vogel, Stephan
AU  - Schultz, Tanja
PY  - 2012/12/1
Y1  - 2012/12/1
AB  - We present our new alignment model Model 3P for cross-lingual word-to-phoneme alignment, and show that unsupervised learning of word segmentation is more accurate when information of another language is used. Word segmentation with cross-lingual information is highly relevant to bootstrap pronunciation dictionaries from audio data for Automatic Speech Recognition, bypass the written form in Speech-to-Speech Translation or build the vocabulary of an unseen language, particularly in the context of under-resourced languages. Using Model 3P for the alignment between English words and Spanish phonemes outperforms a state-of-the-art monolingual word segmentation approach [1] on the BTEC corpus [2] by up to 42% absolute in F-Score on the phoneme level and a GIZA++ alignment based on IBM Model 3 by up to 17%.
KW  - alignment model
KW  - speech-to-speech translation
KW  - under-resourced language
KW  - word segmentation
UR  - http://www.scopus.com/inward/record.url?scp=84874250689&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84874250689&partnerID=8YFLogxK
U2  - 10.1109/SLT.2012.6424202
DO  - 10.1109/SLT.2012.6424202
M3  - Conference contribution
SN  - 9781467351263
SP  - 85
EP  - 90
BT  - 2012 IEEE Workshop on Spoken Language Technology, SLT 2012 - Proceedings
ER  -