Investigating and Addressing Hallucinations of LLMs in Tasks Involving Negation

Type: Preprint

Publication Date: 2024-06-08

Citations: 0

DOI: https://doi.org/10.48550/arxiv.2406.05494

Abstract

Large Language Models (LLMs) have achieved remarkable performance across a wide variety of natural language tasks. However, they have been shown to suffer from a critical limitation: 'hallucination' in their output. Recent research has focused on investigating and addressing this problem for a variety of tasks such as biography generation, question answering, abstractive summarization, and dialogue generation. However, the crucial aspect of 'negation' has remained considerably underexplored. Negation is important because it adds depth and nuance to the understanding of language and is also crucial for logical reasoning and inference. In this work, we address the above limitation and focus specifically on studying the impact of negation on LLM hallucinations. In particular, we study four tasks with negation: 'false premise completion', 'constrained fact generation', 'multiple choice question answering', and 'fact generation'. We show that open-source state-of-the-art LLMs such as LLaMA-2-chat, Vicuna, and Orca-2 hallucinate considerably on all of these tasks involving negation, which underlines a critical shortcoming of these models. To address this problem, we further study numerous strategies to mitigate these hallucinations and demonstrate their impact.
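
The abstract frames the study around probing LLMs with four negation-centric task formats. As a purely illustrative sketch (the prompt templates below are assumptions for demonstration, not the paper's actual prompts, and query_model is a hypothetical stand-in for a call to a model such as LLaMA-2-chat, Vicuna, or Orca-2), such probes might be constructed as follows:

    # Illustrative sketch only: example negation probes for the four task types
    # named in the abstract. The templates and the query_model stub are
    # assumptions made for demonstration; they are not the authors' prompts or code.

    NEGATION_PROBES = {
        "false premise completion":
            "Complete the sentence: Since the Sun does not rise in the east, ...",
        "constrained fact generation":
            "State one true fact about water that does not mention its chemical formula.",
        "multiple choice question answering":
            "Which of the following is not a planet? (a) Mars (b) Jupiter (c) the Moon (d) Venus",
        "fact generation":
            "Generate a factual statement about a country that is not in Europe.",
    }

    def query_model(prompt: str) -> str:
        """Hypothetical stand-in for an LLM call (e.g., LLaMA-2-chat, Vicuna, or Orca-2)."""
        raise NotImplementedError("Plug in an actual model client here.")

    if __name__ == "__main__":
        for task, prompt in NEGATION_PROBES.items():
            print(f"[{task}] {prompt}")
            # response = query_model(prompt)
            # A hallucinated response would accept the false premise or violate
            # the stated negation constraint.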

Locations

  • arXiv (Cornell University)

Similar Works

  • Chain-of-Verification Reduces Hallucination in Large Language Models (2023) - Shehzaad Dhuliawala, Mojtaba Komeili, Jing Xu, Roberta Raileanu, Xian Li, Aslı Çelikyılmaz, Jason Weston
  • This is not a Dataset: A Large Negation Benchmark to Challenge Large Language Models (2023) - Iker García-Ferrero, Begoña Altuna, Javier Álvez, Itziar González-Dios, Germán Rigau
  • Say What You Mean! Large Language Models Speak Too Positively about Negative Commonsense Knowledge (2023) - Jiangjie Chen, Wei Shi, Ziquan Fu, Sijie Cheng, Lei Li, Yanghua Xiao
  • The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models (2024) - Giwon Hong, Aryo Pradipta Gema, Rohit Saxena, Xiaotang Du, Ping Nie, Yu Zhao, Laura Perez-Beltrachini, Max Ryabinin, Xuanli He, Pasquale Minervini
  • CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning about Negation (2022) - Abhilasha Ravichander, Matt Gardner, Ana Marasović
  • The Pitfalls of Defining Hallucination (2024) - Kees van Deemter
  • DelucionQA: Detecting Hallucinations in Domain-specific Question Answering (2023) - Mobashir Sadat, Zhengyu Zhou, Lukas Lange, Jun Araki, Arsalan Gundroo, Bingqing Wang, Rakesh R. Menon, Md Rizwan Parvez, Zhe Feng
  • Generating Diverse Negations from Affirmative Sentences (2024) - Darian Rodriguez Vasquez, Afroditi Papadaki
  • KnowHalu: Hallucination Detection via Multi-Form Knowledge Based Factual Checking (2024) - Jiawei Zhang, Chejian Xu, Yu Gai, Freddy Lécué, Dawn Song, Bo Li
  • Cost-Effective Hallucination Detection for LLMs (2024) - Simon Valentin, Jinmiao Fu, Gianluca Detommaso, Shaoyuan Xu, Giovanni Zappella, Bryan Wang
  • Logical Consistency of Large Language Models in Fact-checking (2024) - Bishwamittra Ghosh, Sarah Hasan, Naheed Anjum Arafat, Arijit Khan
  • With a Little Push, NLI Models can Robustly and Efficiently Predict Faithfulness (2023) - Julius Steen, Juri Opitz, Anette Frank, Katja Markert
  • Iter-AHMCL: Alleviate Hallucination for Large Language Model via Iterative Model-level Contrastive Learning (2024) - Huiwen Wu, Xiaohan Li, Xiaogang Xu, Jiafei Wu, Deyi Zhang, Zhe Liu
  • ScoNe: Benchmarking Negation Reasoning in Language Models With Fine-Tuning and In-Context Learning (2023) - Jingyuan Selena She, Christopher Potts, Samuel R. Bowman, Atticus Geiger

Works That Cite This (0)


Works Cited by This (0)
