Feature-rich part-of-speech tagging for morphologically complex languages: Application to Bulgarian

Georgi Georgiev, Valentin Zhikov, Petya Osenova, Kiril Simov, Preslav Nakov

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

13 Citations (Scopus)

Abstract

We present experiments with part-of-speech tagging for Bulgarian, a Slavic language with rich inflectional and derivational morphology. Unlike most previous work, which has used a small number of grammatical categories, we work with 680 morpho-syntactic tags. We combine a large morphological lexicon with prior linguistic knowledge and guided learning from a POS-annotated corpus, achieving an accuracy of 97.98%, which is a significant improvement over the state of the art for Bulgarian.

Original language: English
Title of host publication: EACL 2012 - 13th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings
Publisher: Association for Computational Linguistics (ACL)
Pages: 492-502
Number of pages: 11
ISBN (Electronic): 9781937284190
Publication status: Published - 1 Jan 2012
Event: 13th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2012 - Avignon, France
Duration: 23 Apr 2012 - 27 Apr 2012

Other

Other: 13th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2012
Country: France
City: Avignon
Period: 23/4/12 - 27/4/12

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Software

Cite this

Georgiev, G., Zhikov, V., Osenova, P., Simov, K., & Nakov, P. (2012). Feature-rich part-of-speech tagging for morphologically complex languages: Application to Bulgarian. In EACL 2012 - 13th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings (pp. 492-502). Association for Computational Linguistics (ACL).