The architecture and the implementation of a finite state pronunciation lexicon for Turkish

Kemal Oflazer, Sharon Inkelas

Research output: Contribution to journal › Article

16 Citations (Scopus)

Abstract

This paper describes the architecture and implementation of a full-scale pronunciation lexicon for Turkish using finite state technology. The system outputs a parallel representation of the pronunciation and the morphological analysis of each word form, so that downstream disambiguation processes can be used to resolve the pronunciation. The pronunciation representation is based on the SAMPA standard and also encodes the position of the primary stress. Computing the position of the primary stress depends on the interplay between any exceptional stress in root words and the stress properties of certain morphemes, and therefore requires a full morphological analysis. The system has been implemented using the XRCE Finite State Toolkit.
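To make the stress interplay described in the abstract concrete, the following is a minimal Python sketch, not the paper's finite state implementation. It assumes a deliberately simplified rule set: stress is word-final by default, a root may carry an exceptional stressed syllable, a "pre-stressing" morpheme places stress on the syllable just before it, and the leftmost such specification wins. The names `Morpheme`, `primary_stress_index` and `render` are hypothetical, the syllable strings are plain placeholders rather than verified SAMPA transcriptions, and resyllabification across morpheme boundaries is ignored. Primary stress is marked with `"` before the stressed syllable, as in SAMPA.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Morpheme:
    """One morpheme of an analysed word (hypothetical toy representation)."""
    syllables: List[str]                # placeholder syllable strings
    root_stress: Optional[int] = None   # exceptionally stressed syllable, if any
    pre_stressing: bool = False         # stress falls just before this morpheme


def primary_stress_index(morphemes: List[Morpheme]) -> int:
    """Return the index of the stressed syllable within the whole word.

    Assumed rule: scan left to right; the first morpheme that specifies
    stress decides it. If none does, stress the final syllable.
    """
    offset = 0  # number of syllables seen so far
    for m in morphemes:
        if m.root_stress is not None:
            return offset + m.root_stress
        if m.pre_stressing and offset > 0:
            return offset - 1
        offset += len(m.syllables)
    return offset - 1  # default: word-final stress


def render(morphemes: List[Morpheme]) -> str:
    """Join syllables with '.', prefixing the stressed one with a '"' mark."""
    syllables = [s for m in morphemes for s in m.syllables]
    stressed = primary_stress_index(morphemes)
    return ".".join(
        ('"' + s) if i == stressed else s for i, s in enumerate(syllables)
    )


# Regular root + plural: default final stress.
print(render([Morpheme(["ki", "tap"]), Morpheme(["lar"])]))
# Negative suffix treated as pre-stressing: stress moves onto the root.
print(render([Morpheme(["gel"]),
              Morpheme(["me"], pre_stressing=True),
              Morpheme(["di"])]))
# Place name with exceptional root stress that survives suffixation.
print(render([Morpheme(["an", "ka", "ra"], root_stress=0), Morpheme(["da"])]))
```

The sketch needs the morpheme segmentation as input, which mirrors the abstract's point that stress placement cannot be computed without a full morphological analysis.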

Original language: English
Pages (from-to): 80-106
Number of pages: 27
Journal: Computer Speech and Language
Volume: 20
Issue number: 1
DOI: 10.1016/j.csl.2005.01.002
Publication status: Published - Jan 2006
Externally published: Yes

Fingerprint

Morphological Analysis
Technology
Roots
Output
Architecture
Standards
Form

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Experimental and Cognitive Psychology
  • Linguistics and Language

Cite this

The architecture and the implementation of a finite state pronunciation lexicon for Turkish. / Oflazer, Kemal; Inkelas, Sharon.

In: Computer Speech and Language, Vol. 20, No. 1, 01.2006, p. 80-106.

@article{936c22e5254a42538a7caaf2f17373c1,
title = "The architecture and the implementation of a finite state pronunciation lexicon for Turkish",
abstract = "This paper describes the architecture and implementation of a full-scale pronunciation lexicon for Turkish using finite state technology. The system outputs a parallel representation of the pronunciation and the morphological analysis of each word form, so that downstream disambiguation processes can be used to resolve the pronunciation. The pronunciation representation is based on the SAMPA standard and also encodes the position of the primary stress. Computing the position of the primary stress depends on the interplay between any exceptional stress in root words and the stress properties of certain morphemes, and therefore requires a full morphological analysis. The system has been implemented using the XRCE Finite State Toolkit.",
author = "Kemal Oflazer and Sharon Inkelas",
year = "2006",
month = "1",
doi = "10.1016/j.csl.2005.01.002",
language = "English",
volume = "20",
pages = "80--106",
journal = "Computer Speech and Language",
issn = "0885-2308",
publisher = "Academic Press Inc.",
number = "1",
}

TY - JOUR

T1 - The architecture and the implementation of a finite state pronunciation lexicon for Turkish

AU - Oflazer, Kemal

AU - Inkelas, Sharon

PY - 2006/1

Y1 - 2006/1

AB - This paper describes the architecture and implementation of a full-scale pronunciation lexicon for Turkish using finite state technology. The system outputs a parallel representation of the pronunciation and the morphological analysis of each word form, so that downstream disambiguation processes can be used to resolve the pronunciation. The pronunciation representation is based on the SAMPA standard and also encodes the position of the primary stress. Computing the position of the primary stress depends on the interplay between any exceptional stress in root words and the stress properties of certain morphemes, and therefore requires a full morphological analysis. The system has been implemented using the XRCE Finite State Toolkit.

UR - http://www.scopus.com/inward/record.url?scp=27744599256&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=27744599256&partnerID=8YFLogxK

U2 - 10.1016/j.csl.2005.01.002

DO - 10.1016/j.csl.2005.01.002

M3 - Article

AN - SCOPUS:27744599256

VL - 20

SP - 80

EP - 106

JO - Computer Speech and Language

JF - Computer Speech and Language

SN - 0885-2308

IS - 1

ER -