Cross-Task Generalization via Natural Language Crowdsourcing Instructions

Type: Article

Publication Date: 2022-01-01

Citations: 180

DOI: https://doi.org/10.18653/v1/2022.acl-long.244

Abstract

Humans (e.g., crowdworkers) have a remarkable ability to solve different tasks simply by reading the textual instructions that define them and looking at a few examples. Despite the success of conventional supervised learning on individual datasets, such models often struggle to generalize across tasks (e.g., a question-answering system cannot solve classification tasks). A long-standing challenge in AI is to build a model that learns a new task by understanding the human-readable instructions that define it. To study this, we introduce NATURAL INSTRUCTIONS, a dataset of 61 distinct tasks, their human-authored instructions, and 193k task instances (input-output pairs). The instructions are obtained from the crowdsourcing instructions used to create existing NLP datasets and are mapped to a unified schema. Using this meta-dataset, we measure cross-task generalization by training models on seen tasks and evaluating them on the remaining unseen ones. We adopt generative pre-trained language models to encode task-specific instructions along with the input and generate the task output. Our results indicate that models benefit from instructions when evaluated on generalization to unseen tasks (19% better for models that utilize instructions). These models, however, fall far short of an estimated performance upper bound, indicating significant room for further progress in this direction.
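As a rough illustration of the setup the abstract describes (not the authors' released code), the sketch below shows one way an instruction could be serialized together with a task instance and passed to an off-the-shelf generative seq2seq model. The model name, schema field names, and example task are assumptions made for illustration only; an untuned model will not produce meaningful answers, so this only demonstrates the encoding step.

    # Minimal sketch, assuming a simplified instruction schema (definition + one
    # positive example); not the paper's implementation.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    MODEL_NAME = "facebook/bart-base"  # assumed stand-in for the paper's generative LMs
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

    def build_prompt(instruction: dict, instance_input: str) -> str:
        """Concatenate instruction fields and the instance input into one string."""
        return " ".join([
            "Definition: " + instruction["definition"],
            "Positive example: " + instruction["positive_example"],
            "Input: " + instance_input,
            "Output:",
        ])

    # Hypothetical instruction and instance, for illustration only.
    instruction = {
        "definition": "Answer the question using only the given passage.",
        "positive_example": "Passage: Paris is in France. Question: Where is Paris? Answer: France",
    }
    prompt = build_prompt(
        instruction,
        "Passage: The Nile flows north. Question: Which direction does the Nile flow?",
    )

    # Encode the combined prompt and generate the task output as free-form text.
    input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids
    output_ids = model.generate(input_ids, max_new_tokens=16)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

In the paper's framing, cross-task generalization would then be measured by fine-tuning such a model on prompts from a set of seen tasks and evaluating it on prompts from held-out, unseen tasks.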

Locations

  • arXiv (Cornell University)
  • Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Similar Works

  • Cross-Task Generalization via Natural Language Crowdsourcing Instructions (2021) - Swaroop Mishra, Daniel Khashabi, Chitta Baral, Hannaneh Hajishirzi
  • Natural Instructions: Benchmarking Generalization to New Tasks from Natural Language Instructions (2021) - Swaroop Mishra, Daniel Khashabi, Chitta Baral, Hannaneh Hajishirzi
  • Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor (2022) - Or Honovich, Thomas Scialom, Omer Levy, Timo Schick
  • Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor (2023) - Or Honovich, Thomas Scialom, Omer Levy, Timo Schick
  • Learning from Task Descriptions (2020) - Orion Weller, Nicholas Lourie, Matt Gardner, Matthew E. Peters
  • Task Ambiguity in Humans and Language Models (2022) - Alex Tamkin, Kunal Handa, Avash Shrestha, Noah D. Goodman
  • What Ingredients Make for an Effective Crowdsourcing Protocol for Difficult NLU Data Collection Tasks? (2021) - Nikita Nangia, Saku Sugawara, Harsh Trivedi, Alex Warstadt, Clara Vania, Samuel R. Bowman
  • The Turking Test: Can Language Models Understand Instructions? (2020) - Avia Efrat, Omer Levy
  • ChatGPT to Replace Crowdsourcing of Paraphrases for Intent Classification: Higher Diversity and Comparable Model Robustness (2023) - Ján Čegiň, Jakub Šimko, Peter Brusilovsky
  • Fine-tuned Language Models are Continual Learners (2022) - Thomas Scialom, Tuhin Chakrabarty, Smaranda Muresan
  • Instruction Induction: From Few Examples to Natural Language Task Descriptions (2022) - Or Honovich, Uri Shaham, Samuel R. Bowman, Omer Levy
  • Instruction Induction: From Few Examples to Natural Language Task Descriptions (2023) - Or Honovich, Uri Shaham, Samuel R. Bowman, Omer Levy
  • SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning (2024) - Chenyang Zhao, Xueying Jia, Vijay Viswanathan, Tongshuang Wu, Graham Neubig
  • Robustness of Learning from Task Instructions (2022) - Jiasheng Gu, Hanzi Xu, Liangyu Nie, Wenpeng Yin
  • Robustness of Learning from Task Instructions (2023) - Jiasheng Gu, Hongyu Zhao, Hanzi Xu, Liangyu Nie, Hongyuan Mei, Wenpeng Yin

Works That Cite This (128)

  • "It Felt Like Having a Second Mind": Investigating Human-AI Co-creativity in Prewriting with Large Language Models (2023) - Qian Wan, Siying Hu, Yu Zhang, Piaohong Wang, Bo Wen, Zhicong Lu
  • generAItor: Tree-in-the-Loop Text Generation for Language Model Explainability and Adaptation (2024) - Thilo Spinner, Rebecca Kehlbeck, Rita Sevastjanova, Tobias Stähle, Daniel A. Keim, Oliver Deußen, Mennatallah El‐Assady
  • ConTinTin: Continual Learning from Task Instructions (2022) - Wenpeng Yin, Jia Li, Caiming Xiong
  • MEGA: Multilingual Evaluation of Generative AI (2023) - Kabir Ahuja, Harshita Diddee, Rishav Hada, Millicent Ochieng, Krithika Ramesh, Prachi Jain, Akshay Nambi, Tanuja Ganu, Sameer Segal, Mohamed Ahmed
  • Best Prompts for Text-to-Image Models and How to Find Them (2023) - Nikita Pavlichenko, Dmitry Ustalov
  • MAPO: Boosting Large Language Model Performance with Model-Adaptive Prompt Optimization (2023) - Yuyan Chen, Zhihao Wen, Ge Fan, Zhengyu Chen, Wei Wu, Dayiheng Liu, Zhixu Li, Bang Liu, Yanghua Xiao
  • LogiCoT: Logical Chain-of-Thought Instruction Tuning (2023) - Hanmeng Liu, Zhiyang Teng, Leyang Cui, Chaoli Zhang, Qiji Zhou, Yue Zhang
  • Instruct and Extract: Instruction Tuning for On-Demand Information Extraction (2023) - Yizhu Jiao, Ming Zhong, Sha Li, Ruining Zhao, Siru Ouyang, Heng Ji, Jiawei Han
  • The Curious Case of Hallucinatory (Un)answerability: Finding Truths in the Hidden States of Over-Confident Large Language Models (2023) - Aviv Slobodkin, Omer Goldman, Avi Caciularu, Ido Dagan, Shauli Ravfogel
  • From Values to Opinions: Predicting Human Behaviors and Stances Using Value-Injected Large Language Models (2023) - Dongjun Kang, Joonsuk Park, Yohan Jo, JinYeong Bak