COT: A Generative Approach for Hate Speech Counter-Narratives via Contrastive Optimal Transport

Type: Preprint

Publication Date: 2024-06-18

Citations: 0

DOI: https://doi.org/10.48550/arxiv.2406.12304

Abstract

Counter-narratives, which are direct responses consisting of non-aggressive, fact-based arguments, have emerged as a highly effective approach to combating the proliferation of hate speech. Previous methods have focused primarily on fine-tuning and post-editing techniques to ensure the fluency of generated content, while overlooking individualization and relevance to the specific targets of hatred, such as LGBT groups and immigrants. This paper introduces a novel framework based on contrastive optimal transport that addresses the challenges of maintaining target interaction and promoting diversification when generating counter-narratives. First, an Optimal Transport Kernel (OTK) module incorporates hatred-target information into the token representations, with comparison pairs extracted between the original and transported features. Second, a self-contrastive learning module addresses model degeneration by generating an anisotropic distribution of token representations. Finally, a target-oriented search method is integrated as an improved decoding strategy that explicitly promotes domain relevance and diversification during inference; it modifies the model's confidence score by considering both token similarity and target relevance. Quantitative and qualitative experiments on two benchmark datasets demonstrate that our proposed model significantly outperforms current methods on metrics covering multiple aspects.
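
The abstract describes the target-oriented search only in words: the model's confidence score is adjusted using token similarity and target relevance. The following is a minimal sketch of one way such a re-scoring could look, assuming a contrastive-search-style penalty on similarity to earlier tokens plus a bonus for similarity to a target embedding. The function names, the weights `alpha` and `beta`, and the exact form of the score are hypothetical illustrations and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rescore(candidates, prev_states, target_vec, alpha=0.4, beta=0.2):
    """Re-rank candidate next tokens (hypothetical scoring rule, not the paper's).

    candidates:  list of (token, model_probability, hidden_vector)
    prev_states: hidden vectors of the tokens generated so far
    target_vec:  embedding standing in for the hate-speech target
    Returns (token, score) pairs sorted from best to worst.
    """
    scored = []
    for token, prob, vec in candidates:
        # Token similarity: highest similarity to any earlier token,
        # used as a penalty against repetitive (degenerate) output.
        penalty = max((cosine(vec, h) for h in prev_states), default=0.0)
        # Target relevance: similarity to the target embedding, used as a bonus.
        relevance = cosine(vec, target_vec)
        score = (1.0 - alpha) * prob - alpha * penalty + beta * relevance
        scored.append((token, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    prev = [[1.0, 0.0]]
    target = [0.0, 1.0]
    cands = [
        ("again", 0.6, [1.0, 0.0]),     # confident, but repeats the context
        ("migrants", 0.4, [0.0, 1.0]),  # less confident, but on target
    ]
    print(rescore(cands, prev, target))
```

In this toy setup the less probable candidate ranks first, because it is dissimilar to the context and aligned with the target vector. The actual method may weight or combine these terms differently.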

Locations

  • arXiv (Cornell University)

Similar Works

  • RAUCG: Retrieval-Augmented Unsupervised Counter Narrative Generation for Hate Speech (2023). Shuyu Jiang, Wenyi Tang, Xingshu Chen, Rui Tang, Haizhou Wang, Wenxian Wang
  • Weigh Your Own Words: Improving Hate Speech Counter Narrative Generation via Attention Regularization (2023). Helena Bonaldi, Giuseppe Attanasio, Debora Nozza, Marco Guerini
  • CounterGeDi: A Controllable Approach to Generate Polite, Detoxified and Emotional Counterspeech (2022). Punyajoy Saha, Kanishk Singh, Adarsh Kumar, Binny Mathew, Animesh Mukherjee
  • Generate, Prune, Select: A Pipeline for Counterspeech Generation against Online Hate Speech (2021). Wanzheng Zhu, Suma Bhat
  • A Multi-Aspect Framework for Counter Narrative Evaluation using Large Language Models (2024). Jaylen Jones, Lingbo Mo, Eric Fosler‐Lussier, Huan Sun
  • Alternative Speech: Complementary Method to Counter-Narrative for Better Discourse (2024). Seungyoon Lee, Dahyun Jung, Chanjun Park, Seolhwa Lee, Heuiseok Lim
  • Using Pre-Trained Language Models for Producing Counter Narratives Against Hate Speech: a Comparative Study (2022). Serra Sinem Tekiroğlu, Helena Bonaldi, Margherita Fanton, Marco Guerini
  • Towards Knowledge-Grounded Counter Narrative Generation for Hate Speech (2021). Yi-Ling Chung, Serra Sinem Tekiroğlu, Marco Guerini
  • Is Safer Better? The Impact of Guardrails on the Argumentative Strength of LLMs in Hate Speech Countering (2024). Helena Bonaldi, Greta Damo, Nicolás Benjamín Ocampo, Elena Cabrio, Serena Villata, Marco Guerini
  • Alternative Speech: Complementary Method to Counter-Narrative for Better Discourse (2023). Seungyoon Lee, Da-Hyun Jung, Chanjun Park, Seolhwa Lee, Heuiseok Lim
  • Outcome-Constrained Large Language Models for Countering Hate Speech (2024). Lingzi Hong, Pengcheng Luo, Eduardo Blanco, Xiaoying Song
  • FOCUS: Forging Originality through Contrastive Use in Self-Plagiarism for Language Models (2024). Kaixin Lan, Fang Tao, Derek Wong, Yabo Xu, Lidia S. Chao, Cecilia G. Zhao
  • Tox-BART: Leveraging Toxicity Attributes for Explanation Generation of Implicit Hate Speech (2024). Neemesh Yadav, Sarah Masud, Vikram Goyal, Vikram Goyal, Md Shad Akhtar, Tanmoy Chakraborty
  • Decoding ChatGPT: A Taxonomy of Existing Research, Current Challenges, and Possible Future Directions (2023). Shahab Saquib Sohail, Faiza Farhat, Yassine Himeur, Mohammad Nadeem, Dag Øivind Madsen, Yashbir Singh, Shadi Atalla, Wathiq Mansoor
  • Intent-conditioned and Non-toxic Counterspeech Generation using Multi-Task Instruction Tuning with RLAIF (2024). Amey Hengle, Aswini Kumar, Sahajpreet Singh, Anil Bandhakavi, Md Shad Akhtar, Tanmoy Chakraborty

Works That Cite This (0)

Works Cited by This (0)
