Feature-rich part-of-speech tagging for morphologically complex languages

Application to Bulgarian

Georgi Georgiev, Valentin Zhikov, Petya Osenova, Kiril Simov, Preslav Nakov

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

12 Citations (Scopus)

Abstract

We present experiments with part-of-speech tagging for Bulgarian, a Slavic language with rich inflectional and derivational morphology. Unlike most previous work, which has used a small number of grammatical categories, we work with 680 morpho-syntactic tags. We combine a large morphological lexicon with prior linguistic knowledge and guided learning from a POS-annotated corpus, achieving accuracy of 97.98%, which is a significant improvement over the state-of-the-art for Bulgarian.

Original language: English
Title of host publication: EACL 2012 - 13th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings
Publisher: Association for Computational Linguistics (ACL)
Pages: 492-502
Number of pages: 11
ISBN (Electronic): 9781937284190
Publication status: Published - 1 Jan 2012
Event: 13th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2012 - Avignon, France
Duration: 23 Apr 2012 to 27 Apr 2012

Other

Other: 13th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2012
Country: France
City: Avignon
Period: 23/4/12 to 27/4/12

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Software

Cite this

Georgiev, G., Zhikov, V., Osenova, P., Simov, K., & Nakov, P. (2012). Feature-rich part-of-speech tagging for morphologically complex languages: Application to Bulgarian. In EACL 2012 - 13th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings (pp. 492-502). Association for Computational Linguistics (ACL).

Feature-rich part-of-speech tagging for morphologically complex languages: Application to Bulgarian. / Georgiev, Georgi; Zhikov, Valentin; Osenova, Petya; Simov, Kiril; Nakov, Preslav.

EACL 2012 - 13th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings. Association for Computational Linguistics (ACL), 2012. p. 492-502.

Georgiev, G, Zhikov, V, Osenova, P, Simov, K & Nakov, P 2012, Feature-rich part-of-speech tagging for morphologically complex languages: Application to Bulgarian. in EACL 2012 - 13th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings. Association for Computational Linguistics (ACL), pp. 492-502, 13th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2012, Avignon, France, 23/4/12.
Georgiev G, Zhikov V, Osenova P, Simov K, Nakov P. Feature-rich part-of-speech tagging for morphologically complex languages: Application to Bulgarian. In EACL 2012 - 13th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings. Association for Computational Linguistics (ACL). 2012. p. 492-502
Georgiev, Georgi ; Zhikov, Valentin ; Osenova, Petya ; Simov, Kiril ; Nakov, Preslav. / Feature-rich part-of-speech tagging for morphologically complex languages: Application to Bulgarian. EACL 2012 - 13th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings. Association for Computational Linguistics (ACL), 2012. pp. 492-502
@inproceedings{58ca0ec897934418b79d820087b230a7,
title = "Feature-rich part-of-speech tagging for morphologically complex languages: Application to Bulgarian",
abstract = "We present experiments with part-of-speech tagging for Bulgarian, a Slavic language with rich inflectional and derivational morphology. Unlike most previous work, which has used a small number of grammatical categories, we work with 680 morpho-syntactic tags. We combine a large morphological lexicon with prior linguistic knowledge and guided learning from a POS-annotated corpus, achieving accuracy of 97.98{\%}, which is a significant improvement over the state-of-the-art for Bulgarian.",
author = "Georgi Georgiev and Valentin Zhikov and Petya Osenova and Kiril Simov and Preslav Nakov",
year = "2012",
month = "1",
day = "1",
language = "English",
pages = "492--502",
booktitle = "EACL 2012 - 13th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings",
publisher = "Association for Computational Linguistics (ACL)",

}

TY - GEN

T1 - Feature-rich part-of-speech tagging for morphologically complex languages

T2 - Application to Bulgarian

AU - Georgiev, Georgi

AU - Zhikov, Valentin

AU - Osenova, Petya

AU - Simov, Kiril

AU - Nakov, Preslav

PY - 2012/1/1

Y1 - 2012/1/1

N2 - We present experiments with part-of-speech tagging for Bulgarian, a Slavic language with rich inflectional and derivational morphology. Unlike most previous work, which has used a small number of grammatical categories, we work with 680 morpho-syntactic tags. We combine a large morphological lexicon with prior linguistic knowledge and guided learning from a POS-annotated corpus, achieving accuracy of 97.98%, which is a significant improvement over the state-of-the-art for Bulgarian.

AB - We present experiments with part-of-speech tagging for Bulgarian, a Slavic language with rich inflectional and derivational morphology. Unlike most previous work, which has used a small number of grammatical categories, we work with 680 morpho-syntactic tags. We combine a large morphological lexicon with prior linguistic knowledge and guided learning from a POS-annotated corpus, achieving accuracy of 97.98%, which is a significant improvement over the state-of-the-art for Bulgarian.

UR - http://www.scopus.com/inward/record.url?scp=84940844979&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84940844979&partnerID=8YFLogxK

M3 - Conference contribution

SP - 492

EP - 502

BT - EACL 2012 - 13th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings

PB - Association for Computational Linguistics (ACL)

ER -