Developing an online hate classifier for multiple social media platforms

Joni Salminen, Maximilian Hopf, Shammur A. Chowdhury, Soon gyo Jung, Hind Almerekhi, Bernard J. Jansen

Research output: Contribution to journal › Article

Abstract

The proliferation of social media enables people to express their opinions widely online. However, at the same time, this has resulted in the emergence of conflict and hate, making online environments uninviting for users. Although researchers have found that hate is a problem across multiple platforms, there is a lack of models for online hate detection using multi-platform data. To address this research gap, we collect a total of 197,566 comments from four platforms: YouTube, Reddit, Wikipedia, and Twitter, with 80% of the comments labeled as non-hateful and the remaining 20% labeled as hateful. We then experiment with several classification algorithms (Logistic Regression, Naïve Bayes, Support Vector Machines, XGBoost, and Neural Networks) and feature representations (Bag-of-Words, TF-IDF, Word2Vec, BERT, and their combination). While all the models significantly outperform the keyword-based baseline classifier, XGBoost using all features performs the best (F1 = 0.92). Feature importance analysis indicates that BERT features are the most impactful for the predictions. Findings support the generalizability of the best model, as the platform-specific results from Twitter and Wikipedia are comparable to their respective source papers. We make our code publicly available for application in real software systems as well as for further development by online hate researchers.
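To make the evaluation setup concrete, the following minimal sketch shows what a keyword-based baseline and an F1 computation can look like. It is an illustration only, not the authors' code: the keyword list, the toy comments, and their labels are made up, and the function names `keyword_baseline` and `f1_score` are hypothetical. It also assumes F1 is computed for the hateful (positive) class, which the abstract does not specify.

```python
from typing import List, Set

# Hypothetical keyword list for illustration; not the lexicon used in the paper.
HATE_KEYWORDS: Set[str] = {"idiot", "scum", "moron"}


def keyword_baseline(comment: str, keywords: Set[str] = HATE_KEYWORDS) -> int:
    """Return 1 (hateful) if any keyword occurs as a token, else 0."""
    # Lowercase each whitespace-separated token and strip surrounding punctuation.
    tokens = {tok.strip(".,!?;:\"'()").lower() for tok in comment.split()}
    return int(bool(tokens & keywords))


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """F1 for the positive class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up toy data: 1 = hateful, 0 = non-hateful.
comments = [
    "You are an idiot!",
    "Great video, thanks.",
    "People like you are scum",
    "Go back to where you came from",           # hateful, but contains no keyword
    "Don't call him an idiot, that's unkind.",  # non-hateful, but contains a keyword
]
labels = [1, 0, 1, 1, 0]

predictions = [keyword_baseline(c) for c in comments]
print(predictions)
print(round(f1_score(labels, predictions), 3))
```

The last two toy comments show the typical weaknesses of keyword matching: it misses hateful comments that contain no listed term and flags benign comments that merely mention one. Learned classifiers over richer feature representations, such as those compared in the abstract, are meant to address both cases.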

Original language: English
Article number: 1
Journal: Human-centric Computing and Information Sciences
Volume: 10
Issue number: 1
DOI: 10.1186/s13673-019-0205-6
Publication status: Published - 1 Dec 2020

Fingerprint

Classifiers
Support vector machines
Logistics
Neural networks
Experiments

Keywords

  • Machine learning
  • Online hate
  • Social media
  • Toxicity

ASJC Scopus subject areas

  • Computer Science(all)

Cite this

Developing an online hate classifier for multiple social media platforms. / Salminen, Joni; Hopf, Maximilian; Chowdhury, Shammur A.; Jung, Soon gyo; Almerekhi, Hind; Jansen, Bernard J.

In: Human-centric Computing and Information Sciences, Vol. 10, No. 1, 1, 01.12.2020.

@article{6bd12fdedbb34b9f910b584d58c8fa96,
title = "Developing an online hate classifier for multiple social media platforms",
keywords = "Machine learning, Online hate, Social media, Toxicity",
author = "Joni Salminen and Maximilian Hopf and Chowdhury, {Shammur A.} and Jung, {Soon gyo} and Hind Almerekhi and Jansen, {Bernard J.}",
year = "2020",
month = "12",
day = "1",
doi = "10.1186/s13673-019-0205-6",
language = "English",
volume = "10",
journal = "Human-centric Computing and Information Sciences",
issn = "2192-1962",
publisher = "Springer Science + Business Media",
number = "1",

}
