Proppy

Organizing the news based on their propagandistic content

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

Propaganda is a mechanism to influence public opinion, which is inherently present in extremely biased and fake news. Here, we propose a model to automatically assess the level of propagandistic content in an article based on different representations, from writing style and readability level to the presence of certain keywords. We experiment thoroughly with different variations of such a model on a new publicly available corpus, and we show that character n-grams and other style features outperform existing alternatives to identify propaganda based on word n-grams. Unlike previous work, we make sure that the test data comes from news sources that were unseen on training, thus penalizing learning algorithms that model the news sources used at training time as opposed to solving the actual task. We integrate our supervised model in a public website, which organizes recent articles covering the same event on the basis of their propagandistic contents. This allows users to quickly explore different perspectives of the same story, and it also enables investigative journalists to dig further into how different media use stories and propaganda to pursue their agenda.

Original language: English
Journal: Information Processing and Management
DOI: 10.1016/j.ipm.2019.03.005
Publication status: Published - 1 Jan 2019

Fingerprint

news
propaganda
journalist
Learning algorithms
public opinion
website
Organizing
event
experiment
learning

Keywords

  • Investigative journalism
  • News bias
  • Propaganda detection

ASJC Scopus subject areas

  • Information Systems
  • Media Technology
  • Computer Science Applications
  • Management Science and Operations Research
  • Library and Information Sciences

Cite this

@article{fd6070481e3846cf99badac1fa234563,
title = "Proppy: Organizing the news based on their propagandistic content",
abstract = "Propaganda is a mechanism to influence public opinion, which is inherently present in extremely biased and fake news. Here, we propose a model to automatically assess the level of propagandistic content in an article based on different representations, from writing style and readability level to the presence of certain keywords. We experiment thoroughly with different variations of such a model on a new publicly available corpus, and we show that character n-grams and other style features outperform existing alternatives to identify propaganda based on word n-grams. Unlike previous work, we make sure that the test data comes from news sources that were unseen on training, thus penalizing learning algorithms that model the news sources used at training time as opposed to solving the actual task. We integrate our supervised model in a public website, which organizes recent articles covering the same event on the basis of their propagandistic contents. This allows users to quickly explore different perspectives of the same story, and it also enables investigative journalists to dig further into how different media use stories and propaganda to pursue their agenda.",
keywords = "Investigative journalism, News bias, Propaganda detection",
author = "Barr{\'o}n-Cede{\~n}o, Alberto and Jaradat, Israa and {Da San Martino}, Giovanni and Nakov, Preslav",
year = "2019",
month = "1",
day = "1",
doi = "10.1016/j.ipm.2019.03.005",
language = "English",
journal = "Information Processing and Management",
issn = "0306-4573",
publisher = "Elsevier Limited"
}

TY - JOUR

T1 - Proppy

T2 - Organizing the news based on their propagandistic content

AU - Barrón-Cedeño, Alberto

AU - Jaradat, Israa

AU - Da San Martino, Giovanni

AU - Nakov, Preslav

PY - 2019/1/1

Y1 - 2019/1/1

AB - Propaganda is a mechanism to influence public opinion, which is inherently present in extremely biased and fake news. Here, we propose a model to automatically assess the level of propagandistic content in an article based on different representations, from writing style and readability level to the presence of certain keywords. We experiment thoroughly with different variations of such a model on a new publicly available corpus, and we show that character n-grams and other style features outperform existing alternatives to identify propaganda based on word n-grams. Unlike previous work, we make sure that the test data comes from news sources that were unseen on training, thus penalizing learning algorithms that model the news sources used at training time as opposed to solving the actual task. We integrate our supervised model in a public website, which organizes recent articles covering the same event on the basis of their propagandistic contents. This allows users to quickly explore different perspectives of the same story, and it also enables investigative journalists to dig further into how different media use stories and propaganda to pursue their agenda.

KW - Investigative journalism

KW - News bias

KW - Propaganda detection

UR - http://www.scopus.com/inward/record.url?scp=85065627668&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85065627668&partnerID=8YFLogxK

U2 - 10.1016/j.ipm.2019.03.005

DO - 10.1016/j.ipm.2019.03.005

M3 - Article

JO - Information Processing and Management

JF - Information Processing and Management

SN - 0306-4573

ER -