Neural network hate deletion

Developing a machine learning model to eliminate hate from online comments

Joni Salminen, Juhani Luotolahti, Hind Almerekhi, Bernard Jansen, Soon Gyo Jung

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

We propose a method for modifying hateful online comments to non-hateful comments without losing the understandability and original meaning of the comments. To accomplish this, we retrieve and classify 301,153 hateful and 1,041,490 non-hateful comments from Facebook and YouTube channels of a large international media organization that is a target of considerable online hate. We supplement this dataset with 10,000 Reddit comments manually labeled for hatefulness. Using these two datasets, we train a neural network to distinguish linguistic patterns. The model we develop, Neural Network Hate Deletion (NNHD), computes how hateful the sentences of a social media comment are and, if they are above a given threshold, deletes them using a language dependency tree. We evaluate the results by comparing crowd workers’ perceptions of hatefulness and understandability before and after transformation and find that our method reduces hatefulness without resulting in a significant loss of understandability. In some cases, removing hateful elements improves understandability by reducing the linguistic complexity of the comment. In addition, we find that NNHD can satisfactorily retain the original meaning on average but is not perfect in this regard. In terms of practical implications, NNHD could be used on social media platforms to suggest more neutral use of language to agitated online users.
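
The score-and-delete step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the scorer below is a hypothetical keyword lexicon standing in for the trained neural network, the sentence splitter is a naive regular expression, and whole sentences are dropped rather than pruned along a dependency tree. The names `TOY_LEXICON`, `toy_hate_score`, `split_sentences`, `delete_hateful_sentences`, and the threshold value are all assumptions made for illustration.

```python
import re
from typing import Callable, List

# Hypothetical stand-in lexicon. The paper's scorer is a trained neural
# network; this keyword set exists only to make the sketch runnable.
TOY_LEXICON = {"idiots", "scum", "vermin"}


def toy_hate_score(sentence: str) -> float:
    """Return the fraction of tokens found in the toy lexicon (assumption)."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if not tokens:
        return 0.0
    return sum(t in TOY_LEXICON for t in tokens) / len(tokens)


def split_sentences(comment: str) -> List[str]:
    """Naively split on '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", comment.strip())
    return [p for p in parts if p]


def delete_hateful_sentences(
    comment: str,
    score: Callable[[str], float] = toy_hate_score,
    threshold: float = 0.2,  # illustrative value, not taken from the paper
) -> str:
    """Keep only the sentences whose score does not exceed the threshold."""
    kept = [s for s in split_sentences(comment) if score(s) <= threshold]
    return " ".join(kept)


if __name__ == "__main__":
    text = "I disagree with this report. These people are scum. The data looks old."
    print(delete_hateful_sentences(text))
```

Passing a different `score` callable would let a real classifier replace the toy lexicon without changing the deletion logic; pruning within a sentence, as the dependency-tree approach in the abstract suggests, is outside the scope of this sketch.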

Original language: English
Title of host publication: Internet Science - 5th International Conference, INSCI 2018, Proceedings
Editors: Svetlana S. Bodrunova
Publisher: Springer Verlag
Pages: 25-39
Number of pages: 15
ISBN (Print): 9783030014360
DOI: https://doi.org/10.1007/978-3-030-01437-7_3
Publication status: Published - 1 Jan 2018
Event: 5th International Conference on Internet Science, INSCI 2018 - St. Petersburg, Russian Federation
Duration: 24 Oct 2018 to 26 Oct 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11193 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 5th International Conference on Internet Science, INSCI 2018
Country: Russian Federation
City: St. Petersburg
Period: 24/10/18 to 26/10/18

Keywords

  • Hate deletion
  • Neural networks
  • Online hate
  • Toxic comments

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Salminen, J., Luotolahti, J., Almerekhi, H., Jansen, B., & Jung, S. G. (2018). Neural network hate deletion: Developing a machine learning model to eliminate hate from online comments. In S. S. Bodrunova (Ed.), Internet Science - 5th International Conference, INSCI 2018, Proceedings (pp. 25-39). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11193 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-01437-7_3

@inproceedings{500a424ec2e44ea6b14a55fc91ce0c03,
title = "Neural network hate deletion: Developing a machine learning model to eliminate hate from online comments",
abstract = "We propose a method for modifying hateful online comments to non-hateful comments without losing the understandability and original meaning of the comments. To accomplish this, we retrieve and classify 301,153 hateful and 1,041,490 non-hateful comments from Facebook and YouTube channels of a large international media organization that is a target of considerable online hate. We supplement this dataset by 10,000 Reddit comments manually labeled for hatefulness. Using these two datasets, we train a neural network to distinguish linguistic patterns. The model we develop, Neural Network Hate Deletion (NNHD), computes how hateful the sentences of a social media comment are and if they are above a given threshold, it deletes them using a language dependency tree. We evaluate the results by comparing crowd workers’ perceptions of hatefulness and understandability before and after transformation and find that our method reduces hatefulness without resulting in a significant loss of understandability. In some cases, removing hateful elements improves understandability by reducing the linguistic complexity of the comment. In addition, we find that NNHD can satisfactorily retain the original meaning on average but is not perfect in this regard. In terms of practical implications, NNHD could be used in social media platforms to suggest more neutral use of language to agitated online users.",
keywords = "Hate deletion, Neural networks, Online hate, Toxic comments",
author = "Joni Salminen and Juhani Luotolahti and Hind Almerekhi and Bernard Jansen and Jung, {Soon Gyo}",
year = "2018",
month = "1",
day = "1",
doi = "10.1007/978-3-030-01437-7_3",
language = "English",
isbn = "9783030014360",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "25--39",
editor = "Bodrunova, {Svetlana S.}",
booktitle = "Internet Science - 5th International Conference, INSCI 2018, Proceedings",

}

TY - GEN

T1 - Neural network hate deletion

T2 - Developing a machine learning model to eliminate hate from online comments

AU - Salminen, Joni

AU - Luotolahti, Juhani

AU - Almerekhi, Hind

AU - Jansen, Bernard

AU - Jung, Soon Gyo

PY - 2018/1/1

Y1 - 2018/1/1

N2 - We propose a method for modifying hateful online comments to non-hateful comments without losing the understandability and original meaning of the comments. To accomplish this, we retrieve and classify 301,153 hateful and 1,041,490 non-hateful comments from Facebook and YouTube channels of a large international media organization that is a target of considerable online hate. We supplement this dataset by 10,000 Reddit comments manually labeled for hatefulness. Using these two datasets, we train a neural network to distinguish linguistic patterns. The model we develop, Neural Network Hate Deletion (NNHD), computes how hateful the sentences of a social media comment are and if they are above a given threshold, it deletes them using a language dependency tree. We evaluate the results by comparing crowd workers’ perceptions of hatefulness and understandability before and after transformation and find that our method reduces hatefulness without resulting in a significant loss of understandability. In some cases, removing hateful elements improves understandability by reducing the linguistic complexity of the comment. In addition, we find that NNHD can satisfactorily retain the original meaning on average but is not perfect in this regard. In terms of practical implications, NNHD could be used in social media platforms to suggest more neutral use of language to agitated online users.

KW - Hate deletion

KW - Neural networks

KW - Online hate

KW - Toxic comments

UR - http://www.scopus.com/inward/record.url?scp=85055785701&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85055785701&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-01437-7_3

DO - 10.1007/978-3-030-01437-7_3

M3 - Conference contribution

SN - 9783030014360

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 25

EP - 39

BT - Internet Science - 5th International Conference, INSCI 2018, Proceedings

A2 - Bodrunova, Svetlana S.

PB - Springer Verlag

ER -