Sentiment analysis in Twitter for Macedonian

Dame Jovanoski, Veno Pachovski, Preslav Nakov

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

17 Citations (Scopus)

Abstract

We present work on sentiment analysis in Twitter for Macedonian. As this is pioneering work for this combination of language and genre, we created suitable resources for training and evaluating a system for sentiment analysis of Macedonian tweets. In particular, we developed a corpus of tweets annotated with tweet-level sentiment polarity (positive, negative, and neutral), as well as with phrase-level sentiment, which we made freely available for research purposes. We further bootstrapped several large-scale sentiment lexicons for Macedonian, motivated by previous work for English. The impact of several different pre-processing steps as well as of various features is shown in experiments that represent the first attempt to build a system for sentiment analysis in Twitter for the morphologically rich Macedonian language. Overall, our experimental results show an F1-score of 92.16, which is very strong and is on par with the best results for English, which were achieved in recent SemEval competitions.

Original language: English
Title of host publication: International Conference Recent Advances in Natural Language Processing, RANLP
Publisher: Association for Computational Linguistics (ACL)
Pages: 249-257
Number of pages: 9
Volume: 2015-January
Publication status: Published - 2015
Event: 10th International Conference on Recent Advances in Natural Language Processing, RANLP 2015 - Hissar, Bulgaria
Duration: 7 Sep 2015 to 9 Sep 2015

Other

Other: 10th International Conference on Recent Advances in Natural Language Processing, RANLP 2015
Country: Bulgaria
City: Hissar
Period: 7/9/15 to 9/9/15

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Software
  • Electrical and Electronic Engineering

Cite this

Jovanoski, D., Pachovski, V., & Nakov, P. (2015). Sentiment analysis in Twitter for Macedonian. In International Conference Recent Advances in Natural Language Processing, RANLP (Vol. 2015-January, pp. 249-257). Association for Computational Linguistics (ACL).

@inproceedings{67226e7a60ad4e878b2fee6e8ff7b370,
title = "Sentiment analysis in Twitter for Macedonian",
abstract = "We present work on sentiment analysis in Twitter for Macedonian. As this is pioneering work for this combination of language and genre, we created suitable resources for training and evaluating a system for sentiment analysis of Macedonian tweets. In particular, we developed a corpus of tweets annotated with tweet-level sentiment polarity (positive, negative, and neutral), as well as with phrase-level sentiment, which we made freely available for research purposes. We further bootstrapped several large-scale sentiment lexicons for Macedonian, motivated by previous work for English. The impact of several different pre-processing steps as well as of various features is shown in experiments that represent the first attempt to build a system for sentiment analysis in Twitter for the morphologically rich Macedonian language. Overall, our experimental results show an F1-score of 92.16, which is very strong and is on par with the best results for English, which were achieved in recent SemEval competitions.",
author = "Dame Jovanoski and Veno Pachovski and Preslav Nakov",
year = "2015",
language = "English",
volume = "2015-January",
pages = "249--257",
booktitle = "International Conference Recent Advances in Natural Language Processing, RANLP",
publisher = "Association for Computational Linguistics (ACL)",

}

TY - GEN

T1 - Sentiment analysis in Twitter for Macedonian

AU - Jovanoski, Dame

AU - Pachovski, Veno

AU - Nakov, Preslav

PY - 2015

Y1 - 2015

N2 - We present work on sentiment analysis in Twitter for Macedonian. As this is pioneering work for this combination of language and genre, we created suitable resources for training and evaluating a system for sentiment analysis of Macedonian tweets. In particular, we developed a corpus of tweets annotated with tweet-level sentiment polarity (positive, negative, and neutral), as well as with phrase-level sentiment, which we made freely available for research purposes. We further bootstrapped several large-scale sentiment lexicons for Macedonian, motivated by previous work for English. The impact of several different pre-processing steps as well as of various features is shown in experiments that represent the first attempt to build a system for sentiment analysis in Twitter for the morphologically rich Macedonian language. Overall, our experimental results show an F1-score of 92.16, which is very strong and is on par with the best results for English, which were achieved in recent SemEval competitions.

UR - http://www.scopus.com/inward/record.url?scp=84949751093&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84949751093&partnerID=8YFLogxK

M3 - Conference contribution

VL - 2015-January

SP - 249

EP - 257

BT - International Conference Recent Advances in Natural Language Processing, RANLP

PB - Association for Computational Linguistics (ACL)

ER -