The incremental use of morphological information and lexicalization in data-driven dependency parsing

Gülşen Eryiğit, Joakim Nivre, Kemal Oflazer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

Typological diversity among the natural languages of the world poses interesting challenges for the models and algorithms used in syntactic parsing. In this paper, we apply a data-driven dependency parser to Turkish, a language characterized by rich morphology and flexible constituent order, and study the effect of employing varying amounts of morpholexical information on parsing performance. The investigations show that accuracy can be improved by using representations based on inflectional groups rather than word forms, confirming earlier studies. In addition, lexicalization and the use of rich morphological features are found to have a positive effect. By combining all these techniques, we obtain the highest reported accuracy for parsing the Turkish Treebank.

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 498-508
Number of pages: 11
Volume: 4285 LNAI
DOIs: https://doi.org/10.1007/11940098_53
Publication status: Published - 2006
Externally published: Yes
Event: 21st International Conference on Computer Processing of Oriental Languages: Beyond the Orient: The Research Challenges Ahead, ICCPOL 2006 - Singapore, Singapore
Duration: 17 Dec 2006 → 19 Dec 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4285 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Fingerprint

Parsing
Syntactics
Data-driven
Natural Language
High Accuracy
Model

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Eryiğit, G., Nivre, J., & Oflazer, K. (2006). The incremental use of morphological information and lexicalization in data-driven dependency parsing. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4285 LNAI, pp. 498-508). https://doi.org/10.1007/11940098_53

@inproceedings{e3b4a540ce7c4ec2b6083e74bef9b73b,
title = "The incremental use of morphological information and lexicalization in data-driven dependency parsing",
abstract = "Typological diversity among the natural languages of the world poses interesting challenges for the models and algorithms used in syntactic parsing. In this paper, we apply a data-driven dependency parser to Turkish, a language characterized by rich morphology and flexible constituent order, and study the effect of employing varying amounts of morpholexical information on parsing performance. The investigations show that accuracy can be improved by using representations based on inflectional groups rather than word forms, confirming earlier studies. In addition, lexicalization and the use of rich morphological features are found to have a positive effect. By combining all these techniques, we obtain the highest reported accuracy for parsing the Turkish Treebank.",
author = "G{\"u}l{\c{s}}en Eryi{\u{g}}it and Joakim Nivre and Kemal Oflazer",
year = "2006",
doi = "10.1007/11940098_53",
language = "English",
isbn = "354049667X",
volume = "4285 LNAI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "498--508",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY  - GEN
T1  - The incremental use of morphological information and lexicalization in data-driven dependency parsing
AU  - Eryiğit, Gülşen
AU  - Nivre, Joakim
AU  - Oflazer, Kemal
PY  - 2006
Y1  - 2006
AB  - Typological diversity among the natural languages of the world poses interesting challenges for the models and algorithms used in syntactic parsing. In this paper, we apply a data-driven dependency parser to Turkish, a language characterized by rich morphology and flexible constituent order, and study the effect of employing varying amounts of morpholexical information on parsing performance. The investigations show that accuracy can be improved by using representations based on inflectional groups rather than word forms, confirming earlier studies. In addition, lexicalization and the use of rich morphological features are found to have a positive effect. By combining all these techniques, we obtain the highest reported accuracy for parsing the Turkish Treebank.
UR  - http://www.scopus.com/inward/record.url?scp=77049122042&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=77049122042&partnerID=8YFLogxK
U2  - 10.1007/11940098_53
DO  - 10.1007/11940098_53
M3  - Conference contribution
AN  - SCOPUS:77049122042
SN  - 354049667X
SN  - 9783540496670
VL  - 4285 LNAI
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 498
EP  - 508
BT  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER  -