Using Pre-Trained Language Models for Producing Counter Narratives Against Hate Speech: a Comparative Study

Type: Article

Publication Date: 2022-01-01

Citations: 7

DOI: https://doi.org/10.18653/v1/2022.findings-acl.245

Abstract

In this work, we present an extensive study on the use of pre-trained language models for the task of automatic Counter Narrative (CN) generation to fight online hate speech in English. We first present a comparative study to determine whether there is a particular Language Model (or class of LMs) and a particular decoding mechanism that are the most appropriate to generate CNs. Findings show that autoregressive models combined with stochastic decoding are the most promising. We then investigate how an LM performs in generating a CN for an unseen target of hate. We find that a key element for successful ‘out of target’ experiments is not an overall similarity with the training data but the presence of a specific subset of training data, i.e., a target that shares some commonalities with the test target and that can be defined a priori. We finally introduce the idea of a pipeline based on the addition of an automatic post-editing step to refine generated CNs.

Locations

  • arXiv (Cornell University)
  • Findings of the Association for Computational Linguistics: ACL 2022
  • Iris (University of Trento)

Similar Works

  • Using Pre-Trained Language Models for Producing Counter Narratives Against Hate Speech: a Comparative Study (2022). Serra Sinem Tekiroğlu, Helena Bonaldi, Margherita Fanton, Marco Guerini
  • Weigh Your Own Words: Improving Hate Speech Counter Narrative Generation via Attention Regularization (2023). Helena Bonaldi, Giuseppe Attanasio, Debora Nozza, Marco Guerini
  • Harnessing Artificial Intelligence to Combat Online Hate: Exploring the Challenges and Opportunities of Large Language Models in Hate Speech Detection (2024). Tharindu Kumarage, Amrita Bhattacharjee, Joshua Garland
  • Generating Counter Narratives against Online Hate Speech: Data and Strategies (2020). Serra Sinem Tekiroğlu, Yi-Ling Chung, Marco Guerini
  • An Investigation of Large Language Models for Real-World Hate Speech Detection (2023). Keyan Guo, Alexander Hu, Jaden Mu, Ziheng Shi, Ziming Zhao, Nishant Vishwamitra, Hongxin Hu
  • Decoding Hate: Exploring Language Models' Reactions to Hate Speech (2024). P Piot, Javier Parapar
  • Detecting Anti-Semitic Hate Speech using Transformer-based Large Language Models (2024). D. Liu, Minghao Wang, Andrew G. Catlin
  • Probing Critical Learning Dynamics of PLMs for Hate Speech Detection (2024). Sarah Masud, Mohammad Aflah Khan, Vikram Goyal, Md Shad Akhtar, Tanmoy Chakraborty
  • Generative AI for Hate Speech Detection: Evaluation and Findings (2023). Sagi Pendzel, Tomer Wullach, Amir Adler, Einat Minkov
  • HateCheckHIn: Evaluating Hindi Hate Speech Detection Models (2022). Mithun Kumar Das, Punyajoy Saha, Binny Mathew, Animesh Mukherjee
  • Evaluating ChatGPT's Performance for Multilingual and Emoji-based Hate Speech Detection (2023). Mithun Das, Saurabh Pandey, Animesh Mukherjee
  • Model-Agnostic Meta-Learning for Multilingual Hate Speech Detection (2023). Md. Rabiul Awal, Roy Ka-Wei Lee, Eshaan Tanwar, Tanmay Garg, Tanmoy Chakraborty
  • Towards Interpretable Hate Speech Detection using Large Language Model-extracted Rationales (2024). Ayushi Nirmal, Amrita Bhattacharjee, Paras Sheth, Huan Liu
  • Probing LLMs for hate speech detection: strengths and vulnerabilities (2023). Sarthak Roy, Ashish Harshavardhan, Animesh Mukherjee, Punyajoy Saha
  • Probing LLMs for hate speech detection: strengths and vulnerabilities (2023). Sarthak Roy, A Venkata Harshvardhan, Animesh Mukherjee, Punyajoy Saha
  • GPT-HateCheck: Can LLMs Write Better Functional Tests for Hate Speech Detection? (2024). Yiping Jin, Leo Wanner, Alexander Shvets
  • Towards Efficient and Explainable Hate Speech Detection via Model Distillation (2024). P Piot, Javier Parapar