On the quality of lexical resources for word sense disambiguation

Lluís Màrquez, Mariona Taulé, Lluís Padró, Luis Villarejo, Maria Antònia Martí

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

Word Sense Disambiguation (WSD) systems are usually evaluated by comparing their absolute performance, in a fixed experimental setting, to that of alternative algorithms and methods. However, little attention has been paid to analyzing the lexical resources and corpora that define the experimental settings, or their possible interactions with the overall results obtained. In this paper we present experiments supporting the hypothesis that the quality of the lexical resources used for tagging the training corpora of WSD systems partly determines the quality of the results. To verify this hypothesis we have developed two kinds of experiments. At the linguistic level, we have tested the quality of lexical resources in terms of the degree of agreement among annotators. From the computational point of view, we have evaluated how those different lexical resources affect the accuracy of the resulting WSD classifiers. We have carried out these experiments using three different lexical resources as sense inventories and a fixed WSD system based on Support Vector Machines.

Original language: English
Title of host publication: Advances in Natural Language Processing - 4th International Conference, EsTAL 2004
Editors: José Luis Vicedo, Patricio Martínez-Barco, Rafael Muñoz, Maximiliano Saiz-Noeda
Publisher: Springer Verlag
Pages: 291-302
Number of pages: 12
ISBN (Electronic): 9783540234982
Publication status: Published - 1 Jan 2004
Event: 4th International Conference on Espana for Natural Language Processing, EsTAL 2004 - Alicante, Spain
Duration: 20 Oct 2004 → 22 Oct 2004

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3230
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 4th International Conference on Espana for Natural Language Processing, EsTAL 2004
Country: Spain
City: Alicante
Period: 20/10/04 → 22/10/04

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Màrquez, L., Taulé, M., Padró, L., Villarejo, L., & Martí, M. A. (2004). On the quality of lexical resources for word sense disambiguation. In J. L. Vicedo, P. Martínez-Barco, R. Muñoz, & M. Saiz-Noeda (Eds.), Advances in Natural Language Processing - 4th International Conference, EsTAL 2004 (pp. 291-302). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 3230). Springer Verlag.