Distributional correspondence indexing for cross-lingual and cross-domain sentiment classification

Alejandro Moreo Fernández, Andrea Esuli, Fabrizio Sebastiani

Research output: Contribution to journal › Article

11 Citations (Scopus)

Abstract

Domain Adaptation (DA) techniques aim at enabling machine learning methods to learn effective classifiers for a "target" domain when the only available training data belongs to a different "source" domain. In this paper we present the Distributional Correspondence Indexing (DCI) method for domain adaptation in sentiment classification. DCI derives term representations in a vector space common to both domains where each dimension reflects its distributional correspondence to a pivot, i.e., to a highly predictive term that behaves similarly across domains. Term correspondence is quantified by means of a distributional correspondence function (DCF). We propose a number of efficient DCFs that are motivated by the distributional hypothesis, i.e., the hypothesis according to which terms with similar meaning tend to have similar distributions in text. Experiments show that DCI obtains better performance than current state-of-the-art techniques for cross-lingual and cross-domain sentiment classification. DCI also brings about a significantly reduced computational cost, and requires a smaller amount of human intervention. As a final contribution, we discuss a more challenging formulation of the domain adaptation problem, in which both the cross-domain and cross-lingual dimensions are tackled simultaneously.
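The idea of representing each term by its correspondence to a set of pivots can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy documents, the hand-picked pivots, the helper names, and the use of cosine similarity between document-occurrence vectors as the DCF. It only shows that two domain-specific terms from different domains can end up with the same profile in the shared pivot space when they co-occur with the pivots in the same way.

```python
import math

# Toy unlabeled documents (as sets of terms) for two domains. Hypothetical data.
SOURCE_DOCS = [{"excellent", "plot"}, {"excellent", "gripping"},
               {"poor", "boring"}, {"poor", "plot"}]
TARGET_DOCS = [{"excellent", "sturdy"}, {"excellent", "knife"},
               {"poor", "flimsy"}, {"poor", "knife"}]

# Pivots: terms assumed to be predictive and to behave alike in both domains.
PIVOTS = ["excellent", "poor"]


def occurrence_vector(term, docs):
    """Binary vector marking the documents in which the term occurs."""
    return [1.0 if term in doc else 0.0 for doc in docs]


def cosine(u, v):
    """Cosine similarity, used here as an illustrative DCF (an assumption)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def term_profile(term, pivots, docs):
    """One dimension per pivot: the term's correspondence to that pivot,
    computed only from the documents of the term's own domain."""
    t = occurrence_vector(term, docs)
    return [cosine(t, occurrence_vector(p, docs)) for p in pivots]


# "gripping" (source domain) and "sturdy" (target domain) never co-occur,
# yet both are profiled in the same two-dimensional pivot space.
gripping = term_profile("gripping", PIVOTS, SOURCE_DOCS)
sturdy = term_profile("sturdy", PIVOTS, TARGET_DOCS)
boring = term_profile("boring", PIVOTS, SOURCE_DOCS)
```

In this toy setting `gripping` and `sturdy` receive identical profiles, while `boring` points along the other pivot, so a classifier trained on source-domain profiles could in principle be applied to target-domain ones.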

Original language: English
Pages (from-to): 131-163
Number of pages: 33
Journal: Journal of Artificial Intelligence Research
Volume: 55
Publication status: Published - 1 Jan 2016

Fingerprint

Vector spaces
Learning systems
Classifiers
Costs
Experiments

ASJC Scopus subject areas

  • Artificial Intelligence

Cite this

Distributional correspondence indexing for cross-lingual and cross-domain sentiment classification. / Fernández, Alejandro Moreo; Esuli, Andrea; Sebastiani, Fabrizio.

In: Journal of Artificial Intelligence Research, Vol. 55, 01.01.2016, p. 131-163.

Fernández, Alejandro Moreo ; Esuli, Andrea ; Sebastiani, Fabrizio. / Distributional correspondence indexing for cross-lingual and cross-domain sentiment classification. In: Journal of Artificial Intelligence Research. 2016 ; Vol. 55. pp. 131-163.
@article{83da1211d6314d63ba0c375ba4fcf6e0,
title = "Distributional correspondence indexing for cross-lingual and cross-domain sentiment classification",
abstract = "Domain Adaptation (DA) techniques aim at enabling machine learning methods to learn effective classifiers for a {"}target{"} domain when the only available training data belongs to a different {"}source{"} domain. In this paper we present the Distributional Correspondence Indexing (DCI) method for domain adaptation in sentiment classification. DCI derives term representations in a vector space common to both domains where each dimension reflects its distributional correspondence to a pivot, i.e., to a highly predictive term that behaves similarly across domains. Term correspondence is quantified by means of a distributional correspondence function (DCF). We propose a number of efficient DCFs that are motivated by the distributional hypothesis, i.e., the hypothesis according to which terms with similar meaning tend to have similar distributions in text. Experiments show that DCI obtains better performance than current state-of-the-art techniques for cross-lingual and cross-domain sentiment classification. DCI also brings about a significantly reduced computational cost, and requires a smaller amount of human intervention. As a final contribution, we discuss a more challenging formulation of the domain adaptation problem, in which both the cross-domain and cross-lingual dimensions are tackled simultaneously.",
author = "Fern{\'a}ndez, {Alejandro Moreo} and Andrea Esuli and Fabrizio Sebastiani",
year = "2016",
month = "1",
day = "1",
language = "English",
volume = "55",
pages = "131--163",
journal = "Journal of Artificial Intelligence Research",
issn = "1076-9757",
publisher = "Morgan Kaufmann Publishers, Inc.",

}

TY - JOUR

T1 - Distributional correspondence indexing for cross-lingual and cross-domain sentiment classification

AU - Fernández, Alejandro Moreo

AU - Esuli, Andrea

AU - Sebastiani, Fabrizio

PY - 2016/1/1

Y1 - 2016/1/1

N2 - Domain Adaptation (DA) techniques aim at enabling machine learning methods to learn effective classifiers for a "target" domain when the only available training data belongs to a different "source" domain. In this paper we present the Distributional Correspondence Indexing (DCI) method for domain adaptation in sentiment classification. DCI derives term representations in a vector space common to both domains where each dimension reflects its distributional correspondence to a pivot, i.e., to a highly predictive term that behaves similarly across domains. Term correspondence is quantified by means of a distributional correspondence function (DCF). We propose a number of efficient DCFs that are motivated by the distributional hypothesis, i.e., the hypothesis according to which terms with similar meaning tend to have similar distributions in text. Experiments show that DCI obtains better performance than current state-of-the-art techniques for cross-lingual and cross-domain sentiment classification. DCI also brings about a significantly reduced computational cost, and requires a smaller amount of human intervention. As a final contribution, we discuss a more challenging formulation of the domain adaptation problem, in which both the cross-domain and cross-lingual dimensions are tackled simultaneously.

AB - Domain Adaptation (DA) techniques aim at enabling machine learning methods to learn effective classifiers for a "target" domain when the only available training data belongs to a different "source" domain. In this paper we present the Distributional Correspondence Indexing (DCI) method for domain adaptation in sentiment classification. DCI derives term representations in a vector space common to both domains where each dimension reflects its distributional correspondence to a pivot, i.e., to a highly predictive term that behaves similarly across domains. Term correspondence is quantified by means of a distributional correspondence function (DCF). We propose a number of efficient DCFs that are motivated by the distributional hypothesis, i.e., the hypothesis according to which terms with similar meaning tend to have similar distributions in text. Experiments show that DCI obtains better performance than current state-of-the-art techniques for cross-lingual and cross-domain sentiment classification. DCI also brings about a significantly reduced computational cost, and requires a smaller amount of human intervention. As a final contribution, we discuss a more challenging formulation of the domain adaptation problem, in which both the cross-domain and cross-lingual dimensions are tackled simultaneously.

UR - http://www.scopus.com/inward/record.url?scp=84958175298&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84958175298&partnerID=8YFLogxK

M3 - Article

VL - 55

SP - 131

EP - 163

JO - Journal of Artificial Intelligence Research

JF - Journal of Artificial Intelligence Research

SN - 1076-9757

ER -