COT: A Generative Approach for Hate Speech Counter-Narratives via Contrastive Optimal Transport

Linhao Zhang, Jin Li, Guangluan Xu, Xiaoyu Li, Xian Sun

Type: Preprint

Publication Date: 2024-06-18

Citations: 0

DOI: https://doi.org/10.48550/arxiv.2406.12304

Abstract

Counter-narratives, which are direct responses consisting of non-aggressive, fact-based arguments, have emerged as a highly effective approach to combating the proliferation of hate speech. Previous methods have focused primarily on fine-tuning and post-editing techniques to ensure the fluency of generated content, while overlooking individualization and relevance to the specific hatred targets, such as LGBT groups and immigrants. This paper introduces a novel framework based on contrastive optimal transport that addresses the challenges of maintaining target interaction and promoting diversification in counter-narrative generation. First, an Optimal Transport Kernel (OTK) module incorporates hatred target information into the token representations, with comparison pairs extracted between the original and transported features. Second, a self-contrastive learning module addresses model degeneration by generating an anisotropic distribution of token representations. Finally, a target-oriented search method is integrated as an improved decoding strategy to explicitly promote domain relevance and diversification during inference; it modifies the model's confidence score by considering both token similarity and target relevance. Quantitative and qualitative experiments on two benchmark datasets demonstrate that the proposed model significantly outperforms current methods on metrics covering multiple aspects.
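
The abstract does not give the scoring formula for the target-oriented search, so the following is only a minimal sketch of how a decoding step could adjust the model's confidence using token similarity and target relevance. The linear combination, the weights `alpha` and `beta`, the function name `rescore`, and the idea of a single `target_vec` are all assumptions made for illustration, not the authors' method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rescore(candidates, context_states, target_vec, alpha=0.4, beta=0.2):
    """Return the (token, score) pair with the highest adjusted score.

    candidates:     list of (token, model_probability, hidden_vector)
    context_states: hidden vectors of the tokens generated so far
    target_vec:     hypothetical vector standing in for the hatred target
    alpha, beta:    assumed weights; not taken from the paper
    """
    best_token, best_score = None, -math.inf
    for token, prob, hidden in candidates:
        # Token-similarity penalty: closeness to the most similar previous token.
        penalty = max((cosine(hidden, h) for h in context_states), default=0.0)
        # Target-relevance reward: closeness to the target representation.
        relevance = cosine(hidden, target_vec)
        score = (1 - alpha - beta) * prob - alpha * penalty + beta * relevance
        if score > best_score:
            best_token, best_score = token, score
    return best_token, best_score


if __name__ == "__main__":
    context = [[1.0, 0.0]]
    target = [0.0, 1.0]
    candidates = [
        ("again", 0.6, [1.0, 0.0]),   # more probable, but repeats the context
        ("rights", 0.4, [0.0, 1.0]),  # less probable, but aligned with the target
    ]
    print(rescore(candidates, context, target))
```

With these toy vectors the less probable but on-target candidate wins; setting `alpha=0` and `beta=0` reduces the procedure to ordinary greedy selection on the model probability.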

Locations

arXiv (Cornell University)

Similar Works

Title	Year	Authors
RAUCG: Retrieval-Augmented Unsupervised Counter Narrative Generation for Hate Speech	2023	Shuyu Jiang, Wenyi Tang, Xingshu Chen, Rui Tang, Haizhou Wang, Wenxian Wang
Weigh Your Own Words: Improving Hate Speech Counter Narrative Generation via Attention Regularization	2023	Helena Bonaldi, Giuseppe Attanasio, Debora Nozza, Marco Guerini
CounterGeDi: A Controllable Approach to Generate Polite, Detoxified and Emotional Counterspeech	2022	Punyajoy Saha, Kanishk Singh, Adarsh Kumar, Binny Mathew, Animesh Mukherjee
Generate, Prune, Select: A Pipeline for Counterspeech Generation against Online Hate Speech	2021	Wanzheng Zhu, Suma Bhat
A Multi-Aspect Framework for Counter Narrative Evaluation using Large Language Models	2024	Jaylen Jones, Lingbo Mo, Eric Fosler‐Lussier, Huan Sun
Alternative Speech: Complementary Method to Counter-Narrative for Better Discourse	2024	Seungyoon Lee, Dahyun Jung, Chanjun Park, Seolhwa Lee, Heuiseok Lim
Using Pre-Trained Language Models for Producing Counter Narratives Against Hate Speech: a Comparative Study	2022	Serra Sinem Tekiroğlu, Helena Bonaldi, Margherita Fanton, Marco Guerini
Towards Knowledge-Grounded Counter Narrative Generation for Hate Speech	2021	Yi-Ling Chung, Serra Sinem Tekiroğlu, Marco Guerini
Is Safer Better? The Impact of Guardrails on the Argumentative Strength of LLMs in Hate Speech Countering	2024	Helena Bonaldi, Greta Damo, Nicolás Benjamín Ocampo, Elena Cabrio, Serena Villata, Marco Guerini
Alternative Speech: Complementary Method to Counter-Narrative for Better Discourse	2023	Seungyoon Lee, Da-Hyun Jung, Chanjun Park, Seolhwa Lee, Heuiseok Lim
Outcome-Constrained Large Language Models for Countering Hate Speech	2024	Lingzi Hong, Pengcheng Luo, Eduardo Blanco, Xiaoying Song
FOCUS: Forging Originality through Contrastive Use in Self-Plagiarism for Language Models	2024	Kaixin Lan, Fang Tao, Derek Wong, Yabo Xu, Lidia S. Chao, Cecilia G. Zhao
Tox-BART: Leveraging Toxicity Attributes for Explanation Generation of Implicit Hate Speech	2024	Neemesh Yadav, Sarah Masud, Vikram Goyal, Vikram Goyal, Md Shad Akhtar, Tanmoy Chakraborty
Decoding ChatGPT: A taxonomy of existing research, current challenges, and possible future directions	2023	Shahab Saquib Sohail, Faiza Farhat, Yassine Himeur, Mohammad Nadeem, Dag Øivind Madsen, Yashbir Singh, Shadi Atalla, Wathiq Mansoor
Intent-conditioned and Non-toxic Counterspeech Generation using Multi-Task Instruction Tuning with RLAIF	2024	Amey Hengle, Aswini Kumar, Sahajpreet Singh, Anil Bandhakavi, Md Shad Akhtar, Tanmoy Chakraborty
