Crossing media streams with sentiment: Domain adaptation in blogs, reviews and Twitter

Yelena Mejova, Padmini Srinivasan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

12 Citations (Scopus)

Abstract

Most sentiment analysis studies address classification of a single source of data such as reviews or blog posts. However, the multitude of social media sources available for text analysis lends itself naturally to domain adaptation. In this study, we create a dataset spanning three social media sources - blogs, reviews, and Twitter - and a set of 37 common topics. We first examine sentiments expressed in these three sources while controlling for the change in topic. Then, using this multidimensional data, we show that when classifying documents in one source (a target source), models trained on other sources of data can be as good as or even better than those trained on the target data. That is, we show that models trained on some social media sources are generalizable to others. All source adaptation models we implement show reviews and Twitter to be the best sources of training data. It is especially useful to know that models trained on Twitter data are generalizable, since Twitter is more topically diverse than reviews.
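The cross-source setup described in the abstract (train on one source, classify documents from a different target source) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name the classifier or features, so a small multinomial Naive Bayes with add-one smoothing stands in, and the six documents in `SOURCES` are invented toy examples, not the authors' dataset. The function names (`train_nb`, `predict`, `cross_source_accuracy`) are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the most probable label for text under the fitted model."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words unseen in training are ignored
                # add-one (Laplace) smoothing
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, docs):
    """Fraction of (text, label) pairs the model labels correctly."""
    return sum(predict(model, text) == label for text, label in docs) / len(docs)


def cross_source_accuracy(sources):
    """Train on each source and evaluate on every *other* source.

    Same-source pairs are skipped: scoring a model on its own training
    documents would overstate accuracy; a held-out split would be needed.
    """
    results = {}
    for train_name, train_docs in sources.items():
        model = train_nb(train_docs)
        for target_name, target_docs in sources.items():
            if target_name != train_name:
                results[(train_name, target_name)] = accuracy(model, target_docs)
    return results


# Invented toy documents (assumption), two per source.
SOURCES = {
    "reviews": [("great phone love it", "pos"),
                ("terrible battery hate it", "neg")],
    "twitter": [("love this great movie", "pos"),
                ("hate this terrible traffic", "neg")],
    "blogs": [("i love the great outdoors", "pos"),
              ("i hate the terrible weather", "neg")],
}

results = cross_source_accuracy(SOURCES)
for (train_name, target_name), acc in sorted(results.items()):
    print(f"train={train_name:8s} target={target_name:8s} accuracy={acc:.2f}")
```

With real data, each cell of this train/target grid would be compared against a model trained and evaluated (on held-out documents) within the target source itself, which is the comparison the abstract refers to.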

Original language: English
Title of host publication: ICWSM 2012 - Proceedings of the 6th International AAAI Conference on Weblogs and Social Media
Pages: 234-241
Number of pages: 8
Publication status: Published - 1 Dec 2012
Externally published: Yes
Event: 6th International AAAI Conference on Weblogs and Social Media, ICWSM 2012 - Dublin, Ireland
Duration: 4 Jun 2012 to 7 Jun 2012

Other

Other: 6th International AAAI Conference on Weblogs and Social Media, ICWSM 2012
Country: Ireland
City: Dublin
Period: 4/6/12 to 7/6/12

Fingerprint

Blogs

ASJC Scopus subject areas

  • Computer Networks and Communications

Cite this

Mejova, Y., & Srinivasan, P. (2012). Crossing media streams with sentiment: Domain adaptation in blogs, reviews and Twitter. In ICWSM 2012 - Proceedings of the 6th International AAAI Conference on Weblogs and Social Media (pp. 234-241).

@inproceedings{672fd2e7b739493c98e32f52cd1d36af,
title = "Crossing media streams with sentiment: Domain adaptation in blogs, reviews and Twitter",
abstract = "Most sentiment analysis studies address classification of a single source of data such as reviews or blog posts. However, the multitude of social media sources available for text analysis lends itself naturally to domain adaptation. In this study, we create a dataset spanning three social media sources - blogs, reviews, and Twitter - and a set of 37 common topics. We first examine sentiments expressed in these three sources while controlling for the change in topic. Then using this multidimensional data we show that when classifying documents in one source (a target source), models trained on other sources of data can be as good as or even better than those trained on the target data. That is, we show that models trained on some social media sources are generalizable to others. All source adaptation models we implement show reviews and Twitter to be the best sources of training data. It is especially useful to know that models trained on Twitter data are generalizable, since, unlike reviews, Twitter is more topically diverse.",
author = "Yelena Mejova and Padmini Srinivasan",
year = "2012",
month = "12",
day = "1",
language = "English",
isbn = "9781577355564",
pages = "234--241",
booktitle = "ICWSM 2012 - Proceedings of the 6th International AAAI Conference on Weblogs and Social Media",

}

TY - GEN

T1 - Crossing media streams with sentiment

T2 - Domain adaptation in blogs, reviews and Twitter

AU - Mejova, Yelena

AU - Srinivasan, Padmini

PY - 2012/12/1

Y1 - 2012/12/1

N2 - Most sentiment analysis studies address classification of a single source of data such as reviews or blog posts. However, the multitude of social media sources available for text analysis lends itself naturally to domain adaptation. In this study, we create a dataset spanning three social media sources - blogs, reviews, and Twitter - and a set of 37 common topics. We first examine sentiments expressed in these three sources while controlling for the change in topic. Then using this multidimensional data we show that when classifying documents in one source (a target source), models trained on other sources of data can be as good as or even better than those trained on the target data. That is, we show that models trained on some social media sources are generalizable to others. All source adaptation models we implement show reviews and Twitter to be the best sources of training data. It is especially useful to know that models trained on Twitter data are generalizable, since, unlike reviews, Twitter is more topically diverse.

UR - http://www.scopus.com/inward/record.url?scp=84890586735&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84890586735&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84890586735

SN - 9781577355564

SP - 234

EP - 241

BT - ICWSM 2012 - Proceedings of the 6th International AAAI Conference on Weblogs and Social Media

ER -