Ordinal text quantification

Giovanni Martino, Wei Gao, Fabrizio Sebastiani

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

In recent years there has been a growing interest in text quantification, a supervised learning task where the goal is to accurately estimate, in an unlabelled set of items, the prevalence (or "relative frequency") of each class c in a predefined set C. Text quantification has several applications, and is a dominant concern in fields such as market research, the social sciences, political science, and epidemiology. In this paper we tackle, for the first time, the problem of ordinal text quantification, defined as the task of performing text quantification when a total order is defined on the set of classes; estimating the prevalence of "five stars" reviews in a set of reviews of a given product, and monitoring this prevalence across time, is an example application. We present OQT, a novel tree-based OQ algorithm, and discuss experimental results obtained on a dataset of tweets classified according to sentiment strength.
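As a rough illustration of the task described in the abstract (not of OQT itself, which this record does not detail), the sketch below estimates class prevalences on a five-point star scale by simply counting predicted labels, then scores the estimate against the true prevalences. Everything here is an assumption made for illustration: the function names, the toy labels, the use of the naive "classify and count" baseline, and the choice of Earth Mover's Distance as an error measure that respects the order of the classes.

```python
from collections import Counter
from itertools import accumulate


def classify_and_count(predicted_labels, classes):
    """Estimate each class's prevalence as the fraction of items predicted in it.

    This is the naive baseline quantifier, not the OQT algorithm.
    """
    counts = Counter(predicted_labels)
    n = len(predicted_labels)
    return [counts[c] / n for c in classes]


def ordinal_emd(true_prev, est_prev):
    """Earth Mover's Distance between two distributions over ordered classes.

    For classes on a line with unit spacing, this equals the sum of absolute
    differences between the two cumulative distributions. The last cumulative
    term is skipped because both distributions sum to 1.
    """
    cum_true = list(accumulate(true_prev))
    cum_est = list(accumulate(est_prev))
    return sum(abs(t - e) for t, e in zip(cum_true[:-1], cum_est[:-1]))


# Hypothetical data: star ratings predicted by some classifier for 10 reviews.
CLASSES = [1, 2, 3, 4, 5]
predicted = [5, 5, 4, 3, 5, 1, 4, 5, 2, 5]

# Hypothetical true prevalences of the five classes in the same set.
true_prev = [0.1, 0.1, 0.2, 0.2, 0.4]

est_prev = classify_and_count(predicted, CLASSES)
error = ordinal_emd(true_prev, est_prev)

print("estimated prevalence:", est_prev)
print("ordinal error (EMD):", round(error, 3))
```

An order-aware measure is used because, when the classes are ordered, mistaking "five stars" mass for "four stars" should cost less than mistaking it for "one star"; a measure that ignores the order would treat both errors alike.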

Original language: English
Title of host publication: SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher: Association for Computing Machinery, Inc
Pages: 937-940
Number of pages: 4
ISBN (Electronic): 9781450342902
DOIs: https://doi.org/10.1145/2911451.2914749
Publication status: Published - 7 Jul 2016
Event: 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2016 - Pisa, Italy
Duration: 17 Jul 2016 to 21 Jul 2016

Other

Other: 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2016
Country: Italy
City: Pisa
Period: 17/7/16 to 21/7/16

Fingerprint

Epidemiology
Social sciences
Supervised learning
Stars
Monitoring

Keywords

  • Ordinal quantification
  • Quantification
  • Sentiment analysis

ASJC Scopus subject areas

  • Information Systems
  • Software

Cite this

Martino, G., Gao, W., & Sebastiani, F. (2016). Ordinal text quantification. In SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 937-940). Association for Computing Machinery, Inc. https://doi.org/10.1145/2911451.2914749

@inproceedings{1456762675154e579e73652da198b970,
title = "Ordinal text quantification",
abstract = "In recent years there has been a growing interest in text quantification, a supervised learning task where the goal is to accurately estimate, in an unlabelled set of items, the prevalence (or {"}relative frequency{"}) of each class c in a predefined set C. Text quantification has several applications, and is a dominant concern in fields such as market research, the social sciences, political science, and epidemiology. In this paper we tackle, for the first time, the problem of ordinal text quantification, defined as the task of performing text quantification when a total order is defined on the set of classes; estimating the prevalence of {"}five stars{"} reviews in a set of reviews of a given product, and monitoring this prevalence across time, is an example application. We present OQT, a novel tree-based OQ algorithm, and discuss experimental results obtained on a dataset of tweets classified according to sentiment strength.",
keywords = "Ordinal quantification, Quantification, Sentiment analysis",
author = "Giovanni Martino and Wei Gao and Fabrizio Sebastiani",
year = "2016",
month = "7",
day = "7",
doi = "10.1145/2911451.2914749",
language = "English",
pages = "937--940",
booktitle = "SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval",
publisher = "Association for Computing Machinery, Inc",
}

TY - GEN

T1 - Ordinal text quantification

AU - Martino, Giovanni

AU - Gao, Wei

AU - Sebastiani, Fabrizio

PY - 2016/7/7

Y1 - 2016/7/7

N2 - In recent years there has been a growing interest in text quantification, a supervised learning task where the goal is to accurately estimate, in an unlabelled set of items, the prevalence (or "relative frequency") of each class c in a predefined set C. Text quantification has several applications, and is a dominant concern in fields such as market research, the social sciences, political science, and epidemiology. In this paper we tackle, for the first time, the problem of ordinal text quantification, defined as the task of performing text quantification when a total order is defined on the set of classes; estimating the prevalence of "five stars" reviews in a set of reviews of a given product, and monitoring this prevalence across time, is an example application. We present OQT, a novel tree-based OQ algorithm, and discuss experimental results obtained on a dataset of tweets classified according to sentiment strength.

KW - Ordinal quantification

KW - Quantification

KW - Sentiment analysis

UR - http://www.scopus.com/inward/record.url?scp=84980349728&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84980349728&partnerID=8YFLogxK

U2 - 10.1145/2911451.2914749

DO - 10.1145/2911451.2914749

M3 - Conference contribution

SP - 937

EP - 940

BT - SIGIR 2016 - Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval

PB - Association for Computing Machinery, Inc

ER -