MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions

Type: Article

Publication Date: 2023-01-01

Citations: 15

DOI: https://doi.org/10.18653/v1/2023.emnlp-main.971

Abstract

The information stored in large language models (LLMs) falls out of date quickly, and retraining from scratch is often not an option. This has recently given rise to a range of techniques for injecting new facts through updating model weights. Current evaluation paradigms are extremely limited, mainly validating the recall of edited facts, but changing one fact should cause rippling changes to the model's related beliefs. If we edit the UK Prime Minister to now be Rishi Sunak, then we should get a different answer to "Who is married to the British Prime Minister?" In this work, we present a benchmark, MQuAKE (Multi-hop Question Answering for Knowledge Editing), comprising multi-hop questions that assess whether edited models correctly answer questions whose answers should change as an entailed consequence of edited facts. While we find that current knowledge-editing approaches can recall edited facts accurately, they fail catastrophically on the constructed multi-hop questions. We thus propose a simple memory-based approach, MeLLo, which stores all edited facts externally while prompting the language model iteratively to generate answers that are consistent with the edited facts. While MQuAKE remains challenging, we show that MeLLo scales well with LLMs (up to 175B parameters) and outperforms previous model editors by a large margin.
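The abstract's description of MeLLo can be made concrete with a small sketch: edited facts sit in an external memory, the multi-hop question is answered one hop at a time, and each hop is checked against the most relevant stored edit. Everything below is illustrative, not the paper's implementation: the keyword-overlap retriever and the `base_answers` lookup table are toy stand-ins for the actual dense retriever and frozen LLM, and the threshold is arbitrary.

```python
# Minimal MeLLo-style loop (toy sketch, not the paper's prompts or retriever).

def tokens(text):
    """Lowercased words with trailing punctuation stripped."""
    return {w.strip("?.,").lower() for w in text.split()}

def overlap(a, b):
    """Toy relevance score: Jaccard similarity over word sets."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / max(len(ta | tb), 1)

def retrieve(memory, subquestion):
    """Return the stored edited fact most similar to the subquestion."""
    return max(memory, key=lambda fact: overlap(fact, subquestion))

def mello(templates, base_answers, memory, threshold=0.5):
    """Answer hop by hop; defer to an edited fact when one is relevant,
    otherwise keep the frozen model's (possibly stale) answer."""
    answer = ""
    for template in templates:
        subq = template.format(answer) if "{}" in template else template
        tentative = base_answers.get(subq, "unknown")  # frozen LLM's guess
        fact = retrieve(memory, subq)                  # nearest edited fact
        if overlap(fact, subq) >= threshold:
            answer = fact.rsplit(" is ", 1)[-1]        # edit answers this hop
        else:
            answer = tentative                         # edit is irrelevant here
    return answer

# Injected edit: the UK Prime Minister is now Rishi Sunak.
memory = ["The Prime Minister of the UK is Rishi Sunak"]
# Hop templates; "{}" is filled with the previous hop's answer.
templates = ["Who is the Prime Minister of the UK?", "Who is married to {}?"]
# Stale/unedited answers a frozen model might give (illustrative only).
base_answers = {
    "Who is the Prime Minister of the UK?": "Boris Johnson",
    "Who is married to Rishi Sunak?": "Akshata Murty",
}
print(mello(templates, base_answers, memory))  # -> Akshata Murty
```

The first hop is overridden by the external edit (Rishi Sunak rather than the stale Boris Johnson), and that corrected answer is what conditions the second hop, which is the rippling behavior the benchmark tests. In the actual method, the language model itself is prompted to decompose the question and to check whether a retrieved fact contradicts its tentative answer.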

Locations

  • arXiv (Cornell University)
  • Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Similar Works

  • MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions (2023) - Zexuan Zhong, Zhengxuan Wu, Christopher D. Manning, Christopher Potts, Danqi Chen
  • Leveraging Logical Rules in Knowledge Editing: A Cherry on the Top (2024) - Keyuan Cheng, Muhammad Ali, Shu Yang, Lin Gang, Yuxuan Zhai, Haoyang Fei, Ke Xu, Lu Yu, Lijie Hu, Di Wang
  • Neighboring Perturbations of Knowledge Editing on Large Language Models (2024) - Jun-Yu Ma, Jia-Chen Gu, Ningyu Zhang, Zhen-Hua Ling
  • Retrieval-Enhanced Knowledge Editing for Multi-Hop Question Answering in Language Models (2024) - Yucheng Shi, Qiaoyu Tan, Xuansheng Wu, Shaochen Zhong, Kaixiong Zhou, Ninghao Liu
  • Retrieval-enhanced Knowledge Editing in Language Models for Multi-Hop Question Answering (2024) - Yucheng Shi, Qiaoyu Tan, Xuansheng Wu, Shaochen Zhong, Kaixiong Zhou, Ninghao Liu
  • LLM-Based Multi-Hop Question Answering with Knowledge Graph Integration in Evolving Environments (2024) - Ruirui Chen, Weifeng Jiang, Chengwei Qin, Ishaan Singh Rawal, Cheston Tan, Dongkyu Choi, Bo Xiong, Bo Ai
  • Joint Knowledge Editing for Information Enrichment and Probability Promotion (2024) - Wenhang Shi, Yiren Chen, Shuqing Bian, Xinyi Zhang, Zhe Zhao, Pengfei Hu, Wei Lu, Xiaoyong Du
  • RIPPLECOT: Amplifying Ripple Effect of Knowledge Editing in Language Models via Chain-of-Thought In-Context Learning (2024) - Zihao Zhao, Yuchen Yang, Yijiang Li, Yinzhi Cao
  • Unveiling the Pitfalls of Knowledge Editing for Large Language Models (2023) - Zhoubo Li, Ningyu Zhang, Yunzhi Yao, Mengru Wang, Xi Chen, Huajun Chen
  • Memory-Based Model Editing at Scale (2022) - Eric Mitchell, Charles P. Lin, Antoine Bosselut, Christopher D. Manning, Chelsea Finn
  • Knowledge Editing through Chain-of-Thought (2024) - C. H. Wang, Weiliang Su, Qingyao Ai, Yiqun Liu
  • Editing the Mind of Giants: An In-Depth Exploration of Pitfalls of Knowledge Editing in Large Language Models (2024) - Cheng-Hsun Hsueh, Paul Kuo-Ming Huang, Tzu-Han Lin, Che-Wei Liao, Hung-Chieh Fang, Chao-Wei Huang, Yun-Nung Chen
  • Navigating the Dual Facets: A Comprehensive Evaluation of Sequential Memory Editing in Large Language Models (2024) - Zihao Lin, Mohammad Beigi, Hongxuan Li, Yufan Zhou, Yuxiang Zhang, Qifan Wang, Wenpeng Yin, Lifu Huang
  • Updating Language Models with Unstructured Facts: Towards Practical Knowledge Editing (2024) - Xiaobao Wu, Liangming Pan, William Yang Wang, Anh Tuan Luu
  • Knowledge Editing with Dynamic Knowledge Graphs for Multi-hop Question Answering (2024) - Yifan Lu, Yiming Zhou, Jing Li, Yequan Wang, Xuebo Liu, Daojing He, Fangming Liu, Min Zhang
  • EpiK-Eval: Evaluation for Language Models as Epistemic Models (2023) - Gabriele Prato, Jerry Huang, Prasanna Parthasarathi, Shagun Sodhani, Sarath Chandar
  • MLaKE: Multilingual Knowledge Editing Benchmark for Large Language Models (2024) - Zihao Wei, Jingcheng Deng, Liang Pang, Hanxing Ding, Huawei Shen, Xueqi Cheng
  • Evaluating Dependencies in Fact Editing for Language Models: Specificity and Implication Awareness (2023) - Zichao Li, Ines Arous, Siva Reddy, Jackie Chi Kit Cheung
  • Locate-then-edit for Multi-hop Factual Recall under Knowledge Editing (2024) - Zhuoran Zhang, Yongxiang Li, Zhengyan Kan, Keyuan Cheng, Lijie Hu, Di Wang

Works Cited by This (27)

  • How Can We Know What Language Models Know? (2020) - Zhengbao Jiang, Frank F. Xu, Jun Araki, Graham Neubig
  • AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts (2020) - Taylor Shin, Yasaman Razeghi, Robert L. Logan, Eric Wallace, Sameer Singh
  • Modifying Memories in Transformer Models (2020) - Chen Zhu, Ankit Singh Rawat, Manzil Zaheer, Srinadh Bhojanapalli, Daliang Li, Felix Yu, Sanjiv Kumar
  • Factual Probing Is [MASK]: Learning vs. Learning to Recall (2021) - Zexuan Zhong, Dan Friedman, Danqi Chen
  • Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs (2021) - Peter Hase, Mona Diab, Aslı Çelikyılmaz, Xian Li, Zornitsa Kozareva, Veselin Stoyanov, Mohit Bansal, Srinivasan Iyer
  • Editing Factual Knowledge in Language Models (2021) - Nicola De Cao, Wilker Aziz, Ivan Titov
  • Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (2022) - Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Ed H. Chi, Quoc V. Le, Denny Zhou
  • Towards Unsupervised Dense Information Retrieval with Contrastive Learning (2021) - Gautier Izacard, Mathilde Caron, Lucas Hosseini, Sebastian Riedel, Piotr Bojanowski, Armand Joulin, Édouard Grave
  • Training language models to follow instructions with human feedback (2022) - Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray
  • Least-to-Most Prompting Enables Complex Reasoning in Large Language Models (2022) - Denny Zhou, Nathanael Schärli, Le Hou, Jason Wei, Nathan Scales, Xuezhi Wang, Dale Schuurmans, Claire Cui, Olivier Bousquet, Quoc V. Le