Supersense tagging for Arabic

The MT-in-The-middle attack

Nathan Schneider, Behrang Mohit, Chris Dyer, Kemal Oflazer, Noah A. Smith

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

6 Citations (Scopus)

Abstract

We consider the task of tagging Arabic nouns with WordNet supersenses. Three approaches are evaluated. The first uses an expert-crafted but limited-coverage lexicon, Arabic WordNet, and heuristics. The second uses unsupervised sequence modeling. The third and most successful approach uses machine translation to translate the Arabic into English, which is automatically tagged with English supersenses, the results of which are then projected back into Arabic. Analysis shows gains and remaining obstacles in four Wikipedia topical domains.
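The projection step in the third approach can be illustrated with a minimal sketch. The abstract does not say how tags are carried back to the Arabic side, so this assumes word alignments between source and translation are available as index pairs; the function name `project_tags`, the tie-breaking rule, and the toy indices and tags are all hypothetical, not the authors' method.

```python
from typing import List, Optional, Tuple


def project_tags(
    num_source_tokens: int,
    target_tags: List[Optional[str]],
    alignment: List[Tuple[int, int]],
) -> List[Optional[str]]:
    """Copy target-side (English) tags onto aligned source-side (Arabic) tokens.

    alignment holds (source_index, target_index) pairs. If a source token is
    aligned to several tagged target tokens, the leftmost target token wins;
    this tie-break is an arbitrary assumption made for the sketch.
    """
    projected: List[Optional[str]] = [None] * num_source_tokens
    # Visit alignment links in target order so the leftmost tagged token wins.
    for src, tgt in sorted(alignment, key=lambda pair: pair[1]):
        tag = target_tags[tgt]
        if tag is not None and projected[src] is None:
            projected[src] = tag
    return projected


# Toy example: 3 source tokens, 4 target tokens, 2 of which carry a supersense.
english_tags = [None, "noun.location", None, "noun.location"]
links = [(0, 1), (2, 3)]  # (source index, target index), hypothetical
print(project_tags(3, english_tags, links))
```

Source tokens with no aligned, tagged counterpart stay untagged (`None`), which is one place where translation or alignment errors would reduce recall.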

Original language: English
Title of host publication: NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference
Publisher: Association for Computational Linguistics (ACL)
Pages: 661-667
Number of pages: 7
ISBN (Print): 9781937284473
Publication status: Published - 2013
Externally published: Yes
Event: 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2013 - Atlanta, United States
Duration: 9 Jun 2013 to 14 Jun 2013

Other

Other: 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2013
Country: United States
City: Atlanta
Period: 9/6/13 to 14/6/13

Fingerprint

Wikipedia
heuristics
coverage
WordNet
Attack
Tagging
Lexicon
Nouns
Modeling
Machine Translation

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Science Applications
  • Linguistics and Language

Cite this

Schneider, N., Mohit, B., Dyer, C., Oflazer, K., & Smith, N. A. (2013). Supersense tagging for Arabic: The MT-in-The-middle attack. In NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference (pp. 661-667). Association for Computational Linguistics (ACL).

Supersense tagging for Arabic : The MT-in-The-middle attack. / Schneider, Nathan; Mohit, Behrang; Dyer, Chris; Oflazer, Kemal; Smith, Noah A.

NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference. Association for Computational Linguistics (ACL), 2013. p. 661-667.


Schneider, N, Mohit, B, Dyer, C, Oflazer, K & Smith, NA 2013, Supersense tagging for Arabic: The MT-in-The-middle attack. in NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference. Association for Computational Linguistics (ACL), pp. 661-667, 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2013, Atlanta, United States, 9/6/13.
Schneider N, Mohit B, Dyer C, Oflazer K, Smith NA. Supersense tagging for Arabic: The MT-in-The-middle attack. In NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference. Association for Computational Linguistics (ACL). 2013. p. 661-667
Schneider, Nathan ; Mohit, Behrang ; Dyer, Chris ; Oflazer, Kemal ; Smith, Noah A. / Supersense tagging for Arabic : The MT-in-The-middle attack. NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference. Association for Computational Linguistics (ACL), 2013. pp. 661-667
@inproceedings{032108c1d3864cd7b7bc2da731406c41,
title = "Supersense tagging for Arabic: The MT-in-The-middle attack",
abstract = "We consider the task of tagging Arabic nouns with WordNet supersenses. Three approaches are evaluated. The first uses an expert-crafted but limited-coverage lexicon, Arabic WordNet, and heuristics. The second uses unsupervised sequence modeling. The third and most successful approach uses machine translation to translate the Arabic into English, which is automatically tagged with English supersenses, the results of which are then projected back into Arabic. Analysis shows gains and remaining obstacles in four Wikipedia topical domains.",
author = "Nathan Schneider and Behrang Mohit and Chris Dyer and Kemal Oflazer and Smith, {Noah A.}",
year = "2013",
language = "English",
isbn = "9781937284473",
pages = "661--667",
booktitle = "NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference",
publisher = "Association for Computational Linguistics (ACL)",

}

TY - GEN

T1 - Supersense tagging for Arabic

T2 - The MT-in-The-middle attack

AU - Schneider, Nathan

AU - Mohit, Behrang

AU - Dyer, Chris

AU - Oflazer, Kemal

AU - Smith, Noah A.

PY - 2013

Y1 - 2013

N2 - We consider the task of tagging Arabic nouns with WordNet supersenses. Three approaches are evaluated. The first uses an expert-crafted but limited-coverage lexicon, Arabic WordNet, and heuristics. The second uses unsupervised sequence modeling. The third and most successful approach uses machine translation to translate the Arabic into English, which is automatically tagged with English supersenses, the results of which are then projected back into Arabic. Analysis shows gains and remaining obstacles in four Wikipedia topical domains.

AB - We consider the task of tagging Arabic nouns with WordNet supersenses. Three approaches are evaluated. The first uses an expert-crafted but limited-coverage lexicon, Arabic WordNet, and heuristics. The second uses unsupervised sequence modeling. The third and most successful approach uses machine translation to translate the Arabic into English, which is automatically tagged with English supersenses, the results of which are then projected back into Arabic. Analysis shows gains and remaining obstacles in four Wikipedia topical domains.

UR - http://www.scopus.com/inward/record.url?scp=84906923987&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84906923987&partnerID=8YFLogxK

M3 - Conference contribution

SN - 9781937284473

SP - 661

EP - 667

BT - NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference

PB - Association for Computational Linguistics (ACL)

ER -