Automated hate speech detection and the problem of offensive language

Thomas Davidson, Dana Warmsley, Michael Macy, Ingmar Weber

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

56 Citations (Scopus)

Abstract

A key challenge for automatic hate-speech detection on social media is separating hate speech from other instances of offensive language. Lexical detection methods tend to have low precision because they classify all messages containing particular terms as hate speech, and previous work using supervised learning has failed to distinguish between the two categories. We use a crowd-sourced hate speech lexicon to collect tweets containing hate speech keywords, and we use crowd-sourcing to label a sample of these tweets into three categories: those containing hate speech, those containing only offensive language, and those with neither. We train a multi-class classifier to distinguish between these categories. Close analysis of the predictions and the errors shows when we can reliably separate hate speech from other offensive language and when this differentiation is more difficult. We find that racist and homophobic tweets are more likely to be classified as hate speech, but that sexist tweets are generally classified as offensive. Tweets without explicit hate keywords are also more difficult to classify.
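The abstract describes a three-step pipeline: collect tweets matching a crowd-sourced hate speech lexicon, crowd-source labels over three classes, and train a multi-class classifier on the labeled sample. As a rough illustration of the classification step, the Python sketch below uses TF-IDF n-gram features and scikit-learn's LogisticRegression; both choices, and the placeholder data, are assumptions for brevity rather than the authors' exact feature set or model.

    # Minimal sketch of the three-class setup described in the abstract.
    # TF-IDF n-grams + logistic regression are illustrative assumptions;
    # the tiny lists below stand in for the crowd-labeled tweet sample.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import Pipeline

    # Placeholder tweets carrying the three crowd-sourced labels.
    tweets = [
        "example tweet labeled as hate speech",
        "example tweet labeled as offensive but not hateful",
        "example tweet labeled as neither",
        "second hate speech example",
        "second merely offensive example",
        "second neutral example",
    ]
    labels = ["hate_speech", "offensive", "neither"] * 2

    # Unigrams through trigrams feeding a multi-class classifier.
    model = Pipeline([
        ("tfidf", TfidfVectorizer(ngram_range=(1, 3))),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    model.fit(tweets, labels)

    # Per-class probabilities make the hate/offensive boundary inspectable,
    # which is the kind of prediction the paper's error analysis examines.
    print(model.classes_)
    print(model.predict_proba(["an unseen tweet to score"]))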

Original language: English
Title of host publication: Proceedings of the 11th International Conference on Web and Social Media, ICWSM 2017
Publisher: AAAI Press
Pages: 512-515
Number of pages: 4
ISBN (Electronic): 9781577357889
Publication status: Published - 1 Jan 2017
Event: 11th International Conference on Web and Social Media, ICWSM 2017 - Montreal, Canada
Duration: 15 May 2017 - 18 May 2017

Other

Other: 11th International Conference on Web and Social Media, ICWSM 2017
Country: Canada
City: Montreal
Period: 15/5/17 - 18/5/17

Fingerprint

Supervised learning
Labels
Classifiers

ASJC Scopus subject areas

  • Computer Networks and Communications

Cite this

Davidson, T., Warmsley, D., Macy, M., & Weber, I. (2017). Automated hate speech detection and the problem of offensive language. In Proceedings of the 11th International Conference on Web and Social Media, ICWSM 2017 (pp. 512-515). AAAI Press.

@inproceedings{0d12c40ad82f47ddad3f5b77f3615d18,
title = "Automated hate speech detection and the problem of offensive language",
abstract = "A key challenge for automatic hate-speech detection on social media is the separation of hate speech from other instances of offensive language. Lexical detection methods tend to have low precision because they classify all messages containing particular terms as hate speech and previous work using supervised learning has failed to distinguish between the two categories.We used a crowd-sourced hate speech lexicon to collect tweets containing hate speech keywords. We use crowd-sourcing to label a sample of these tweets into three categories: those containing hate speech, only offensive language, and those with neither. We train a multi-class classifier to distinguish between these different categories. Close analysis of the predictions and the errors shows when we can reliably separate hate speech from other offensive language and when this differentiation is more difficult. We find that racist and homophobic tweets are more likely to be classified as hate speech but that sexist tweets are generally classified as offensive. Tweets without explicit hate keywords are also more difficult to classify.",
author = "Thomas Davidson and Dana Warmsley and Michael Macy and Ingmar Weber",
year = "2017",
month = "1",
day = "1",
language = "English",
pages = "512--515",
booktitle = "Proceedings of the 11th International Conference on Web and Social Media, ICWSM 2017",
publisher = "AAAI press",

}

TY - GEN

T1 - Automated hate speech detection and the problem of offensive language

AU - Davidson, Thomas

AU - Warmsley, Dana

AU - Macy, Michael

AU - Weber, Ingmar

PY - 2017/1/1

Y1 - 2017/1/1

N2 - A key challenge for automatic hate-speech detection on social media is separating hate speech from other instances of offensive language. Lexical detection methods tend to have low precision because they classify all messages containing particular terms as hate speech, and previous work using supervised learning has failed to distinguish between the two categories. We use a crowd-sourced hate speech lexicon to collect tweets containing hate speech keywords, and we use crowd-sourcing to label a sample of these tweets into three categories: those containing hate speech, those containing only offensive language, and those with neither. We train a multi-class classifier to distinguish between these categories. Close analysis of the predictions and the errors shows when we can reliably separate hate speech from other offensive language and when this differentiation is more difficult. We find that racist and homophobic tweets are more likely to be classified as hate speech, but that sexist tweets are generally classified as offensive. Tweets without explicit hate keywords are also more difficult to classify.

AB - A key challenge for automatic hate-speech detection on social media is separating hate speech from other instances of offensive language. Lexical detection methods tend to have low precision because they classify all messages containing particular terms as hate speech, and previous work using supervised learning has failed to distinguish between the two categories. We use a crowd-sourced hate speech lexicon to collect tweets containing hate speech keywords, and we use crowd-sourcing to label a sample of these tweets into three categories: those containing hate speech, those containing only offensive language, and those with neither. We train a multi-class classifier to distinguish between these categories. Close analysis of the predictions and the errors shows when we can reliably separate hate speech from other offensive language and when this differentiation is more difficult. We find that racist and homophobic tweets are more likely to be classified as hate speech, but that sexist tweets are generally classified as offensive. Tweets without explicit hate keywords are also more difficult to classify.

UR - http://www.scopus.com/inward/record.url?scp=85026777430&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85026777430&partnerID=8YFLogxK

M3 - Conference contribution

SP - 512

EP - 515

BT - Proceedings of the 11th International Conference on Web and Social Media, ICWSM 2017

PB - AAAI Press

ER -