SVMTool

A general POS tagger generator based on support vector machines

Jesús Giménez, Lluis Marques

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

193 Citations (Scopus)

Abstract

This paper presents the SVMTool, a simple, flexible, effective and efficient part-of-speech tagger based on Support Vector Machines. The SVMTool offers a fairly good balance among these properties, which makes it really practical for current NLP applications. It is very easy to use and easily configurable so as to fit the needs of a number of different applications. Results are also very competitive, achieving an accuracy of 97.16% for English on the Wall Street Journal corpus. It has also been successfully applied to Spanish, exhibiting similar performance. A first release of the SVMTool Perl prototype is now freely available for public use. A more efficient C++ version is coming very soon.

Original language: English
Title of host publication: Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004
Publisher: European Language Resources Association (ELRA)
Pages: 43-46
Number of pages: 4
ISBN (Electronic): 2951740816, 9782951740815
Publication status: Published - 1 Jan 2004
Event: 4th International Conference on Language Resources and Evaluation, LREC 2004 - Lisbon, Portugal
Duration: 26 May 2004 to 28 May 2004

Other

Other: 4th International Conference on Language Resources and Evaluation, LREC 2004
Country: Portugal
City: Lisbon
Period: 26/5/04 to 28/5/04

Fingerprint

performance
Support Vector Machine
Tag
Prototype
Wall Street Journal
Natural Language Processing
Part of Speech

ASJC Scopus subject areas

  • Library and Information Sciences
  • Education
  • Language and Linguistics
  • Linguistics and Language

Cite this

Giménez, J., & Marques, L. (2004). SVMTool: A general POS tagger generator based on support vector machines. In Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004 (pp. 43-46). European Language Resources Association (ELRA).

@inproceedings{3e12b4cf7d5a4c41824d30df9c43bbfa,
title = "SVMTool: A general POS tagger generator based on support vector machines",
abstract = "This paper presents the SVMTool, a simple, flexible, effective and efficient part-of-speech tagger based on Support Vector Machines. The SVMTool offers a fairly good balance among these properties which make it really practical for current NLP applications. It is very easy to use and easily configurable so as to perfectly fit the needs of a number of different applications. Results are also very competitive, achieving an accuracy of 97.16{\%} for English on the Wall Street Journal corpus. It has been also successfully applied to Spanish exhibiting a similar performance. A first release of the SVMTool Perl prototype is now freely available for public use. A most efficient C++ version is coming very soon.",
author = "Jes{\'u}s Gim{\'e}nez and Lluis Marques",
year = "2004",
month = "1",
day = "1",
language = "English",
pages = "43--46",
booktitle = "Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004",
publisher = "European Language Resources Association (ELRA)",
}

TY - GEN

T1 - SVMTool

T2 - A general POS tagger generator based on support vector machines

AU - Giménez, Jesús

AU - Marques, Lluis

PY - 2004/1/1

Y1 - 2004/1/1

AB - This paper presents the SVMTool, a simple, flexible, effective and efficient part-of-speech tagger based on Support Vector Machines. The SVMTool offers a fairly good balance among these properties which make it really practical for current NLP applications. It is very easy to use and easily configurable so as to perfectly fit the needs of a number of different applications. Results are also very competitive, achieving an accuracy of 97.16% for English on the Wall Street Journal corpus. It has been also successfully applied to Spanish exhibiting a similar performance. A first release of the SVMTool Perl prototype is now freely available for public use. A most efficient C++ version is coming very soon.

UR - http://www.scopus.com/inward/record.url?scp=85035364491&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85035364491&partnerID=8YFLogxK

M3 - Conference contribution

SP - 43

EP - 46

BT - Proceedings of the 4th International Conference on Language Resources and Evaluation, LREC 2004

PB - European Language Resources Association (ELRA)

ER -