On the quality of lexical resources for word sense disambiguation

Lluis Marques, Mariona Taulé, Lluís Padró, Luis Villarejo, Maria Antònia Martí

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Word Sense Disambiguation (WSD) systems are usually evaluated by comparing their absolute performance, in a fixed experimental setting, to other alternative algorithms and methods. However, little attention has been paid to analyze the lexical resources and the corpora defining the experimental settings and their possible interactions with the overall results obtained. In this paper we present some experiments supporting the hypothesis that the quality of lexical resources used for tagging the training corpora of WSD systems partly determines the quality of the results. In order to verify this initial hypothesis we have developed two kinds of experiments. At the linguistic level, we have tested the quality of lexical resources in terms of the annotators' agreement degree. From the computational point of view, we have evaluated how those different lexical resources affect the accuracy of the resulting WSD classifiers. We have carried out these experiments using three different lexical resources as sense inventories and a fixed WSD system based on Support Vector Machines.

Original language: English
Title of host publication: Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science)
Editors: J.L. Vicedo, P. Martinez-Barco, R. Munoz, M. S. Noeda
Pages: 291-302
Number of pages: 12
Volume: 3230
Publication status: Published - 2004
Externally published: Yes
Event: 4th International Conference EsTAL 2004 - Advances in Natural Language Processing - Alicante, Spain
Duration: 20 Oct 2004 - 22 Oct 2004

Other

Other: 4th International Conference EsTAL 2004 - Advances in Natural Language Processing
Country: Spain
City: Alicante
Period: 20/10/04 - 22/10/04

Fingerprint

Experiments
Linguistics
Support vector machines
Classifiers

ASJC Scopus subject areas

  • Hardware and Architecture

Cite this

Marques, L., Taulé, M., Padró, L., Villarejo, L., & Martí, M. A. (2004). On the quality of lexical resources for word sense disambiguation. In J. L. Vicedo, P. Martinez-Barco, R. Munoz, & M. S. Noeda (Eds.), Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 3230, pp. 291-302)

On the quality of lexical resources for word sense disambiguation. / Marques, Lluis; Taulé, Mariona; Padró, Lluís; Villarejo, Luis; Martí, Maria Antònia.

Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science). ed. / J.L. Vicedo; P. Martinez-Barco; R. Munoz; M. S. Noeda. Vol. 3230 2004. p. 291-302.

Marques, L, Taulé, M, Padró, L, Villarejo, L & Martí, MA 2004, On the quality of lexical resources for word sense disambiguation. in JL Vicedo, P Martinez-Barco, R Munoz & MS Noeda (eds), Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science). vol. 3230, pp. 291-302, 4th International Conference EsTAL 2004 - Advances in Natural Language Processing, Alicante, Spain, 20/10/04.

Marques L, Taulé M, Padró L, Villarejo L, Martí MA. On the quality of lexical resources for word sense disambiguation. In Vicedo JL, Martinez-Barco P, Munoz R, Noeda MS, editors, Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science). Vol. 3230. 2004. p. 291-302

Marques, Lluis ; Taulé, Mariona ; Padró, Lluís ; Villarejo, Luis ; Martí, Maria Antònia. / On the quality of lexical resources for word sense disambiguation. Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science). editor / J.L. Vicedo ; P. Martinez-Barco ; R. Munoz ; M. S. Noeda. Vol. 3230 2004. pp. 291-302

@inproceedings{2df9a85ebe23428aa0b77e0aa20219f4,
title = "On the quality of lexical resources for word sense disambiguation",
abstract = "Word Sense Disambiguation (WSD) systems are usually evaluated by comparing their absolute performance, in a fixed experimental setting, to other alternative algorithms and methods. However, little attention has been paid to analyze the lexical resources and the corpora defining the experimental settings and their possible interactions with the overall results obtained. In this paper we present some experiments supporting the hypothesis that the quality of lexical resources used for tagging the training corpora of WSD systems partly determines the quality of the results. In order to verify this initial hypothesis we have developed two kinds of experiments. At the linguistic level, we have tested the quality of lexical resources in terms of the annotators' agreement degree. From the computational point of view, we have evaluated how those different lexical resources affect the accuracy of the resulting WSD classifiers. We have carried out these experiments using three different lexical resources as sense inventories and a fixed WSD system based on Support Vector Machines.",
author = "Lluis Marques and Mariona Taul{\'e} and Llu{\'i}s Padr{\'o} and Luis Villarejo and Mart{\'i}, {Maria Ant{\`o}nia}",
year = "2004",
language = "English",
volume = "3230",
pages = "291--302",
editor = "J.L. Vicedo and P. Martinez-Barco and R. Munoz and Noeda, {M. S.}",
booktitle = "Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science)",
}

TY - GEN

T1 - On the quality of lexical resources for word sense disambiguation

AU - Marques, Lluis

AU - Taulé, Mariona

AU - Padró, Lluís

AU - Villarejo, Luis

AU - Martí, Maria Antònia

PY - 2004

Y1 - 2004

N2 - Word Sense Disambiguation (WSD) systems are usually evaluated by comparing their absolute performance, in a fixed experimental setting, to other alternative algorithms and methods. However, little attention has been paid to analyze the lexical resources and the corpora defining the experimental settings and their possible interactions with the overall results obtained. In this paper we present some experiments supporting the hypothesis that the quality of lexical resources used for tagging the training corpora of WSD systems partly determines the quality of the results. In order to verify this initial hypothesis we have developed two kinds of experiments. At the linguistic level, we have tested the quality of lexical resources in terms of the annotators' agreement degree. From the computational point of view, we have evaluated how those different lexical resources affect the accuracy of the resulting WSD classifiers. We have carried out these experiments using three different lexical resources as sense inventories and a fixed WSD system based on Support Vector Machines.

AB - Word Sense Disambiguation (WSD) systems are usually evaluated by comparing their absolute performance, in a fixed experimental setting, to other alternative algorithms and methods. However, little attention has been paid to analyze the lexical resources and the corpora defining the experimental settings and their possible interactions with the overall results obtained. In this paper we present some experiments supporting the hypothesis that the quality of lexical resources used for tagging the training corpora of WSD systems partly determines the quality of the results. In order to verify this initial hypothesis we have developed two kinds of experiments. At the linguistic level, we have tested the quality of lexical resources in terms of the annotators' agreement degree. From the computational point of view, we have evaluated how those different lexical resources affect the accuracy of the resulting WSD classifiers. We have carried out these experiments using three different lexical resources as sense inventories and a fixed WSD system based on Support Vector Machines.

UR - http://www.scopus.com/inward/record.url?scp=22944487989&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=22944487989&partnerID=8YFLogxK

M3 - Conference contribution

VL - 3230

SP - 291

EP - 302

BT - Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science)

A2 - Vicedo, J.L.

A2 - Martinez-Barco, P.

A2 - Munoz, R.

A2 - Noeda, M. S.

ER -