Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation

Type: Article

Publication Date: 2022-01-01

Citations: 15

DOI: https://doi.org/10.18653/v1/2022.emnlp-main.630

Abstract

In this paper, we explore the challenging problem of performing a generative task in a target language when labeled data is only available in English, using summarization as a case study. We assume a strict setting with no access to parallel data or machine translation and find that common transfer learning approaches struggle in this setting, as a generative multilingual model fine-tuned purely on English catastrophically forgets how to generate non-English. Given the recent rise of parameter-efficient adaptation techniques, we conduct the first investigation into how one such method, prompt tuning (Lester et al., 2021), can overcome catastrophic forgetting to enable zero-shot cross-lingual generation. Our experiments show that parameter-efficient prompt tuning provides gains over standard fine-tuning when transferring between less-related languages, e.g., from English to Thai. However, a significant gap still remains between these methods and fully-supervised baselines. To improve cross-lingual transfer further, we explore several approaches, including: (1) mixing in unlabeled multilingual data, and (2) explicitly factoring prompts into recombinable language and task components. Our approaches can provide further quality gains, suggesting that robust zero-shot cross-lingual generation is within reach.
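As a rough illustration of the approach the abstract describes, the sketch below shows what prompt tuning with a prompt factored into language and task components can look like: only the small prompt matrices are trainable while the multilingual generation model stays frozen, and the language and task pieces can be recombined at inference. This is not the authors' implementation; the class name, prompt lengths, and embedding dimension are illustrative assumptions in a generic PyTorch style.

```python
# Illustrative sketch only (not the paper's code): factored prompt tuning,
# where trainable "language" and "task" prompt vectors are prepended to the
# input embeddings of a frozen multilingual generation model.
import torch
import torch.nn as nn


class FactoredPromptEncoder(nn.Module):
    """Prepends trainable language and task prompt embeddings to frozen inputs."""

    def __init__(self, embed_dim: int, task_len: int = 50, lang_len: int = 50):
        super().__init__()
        # Only these two tensors receive gradients; the backbone model is frozen,
        # which is what keeps it from forgetting how to generate non-English.
        self.task_prompt = nn.Parameter(torch.randn(task_len, embed_dim) * 0.02)
        self.lang_prompt = nn.Parameter(torch.randn(lang_len, embed_dim) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, embed_dim) from the frozen embedding table.
        batch = input_embeds.size(0)
        lang = self.lang_prompt.unsqueeze(0).expand(batch, -1, -1)
        task = self.task_prompt.unsqueeze(0).expand(batch, -1, -1)
        # At inference, an English-trained task prompt can be paired with a
        # language prompt learned from unlabeled target-language text.
        return torch.cat([lang, task, input_embeds], dim=1)


if __name__ == "__main__":
    embed_dim = 768  # assumed hidden size
    encoder = FactoredPromptEncoder(embed_dim)
    dummy_inputs = torch.randn(2, 128, embed_dim)  # stand-in for frozen embeddings
    prompted = encoder(dummy_inputs)
    print(prompted.shape)  # torch.Size([2, 228, 768])
```

In this reading of the method, swapping the language component while keeping the task component fixed is what enables zero-shot transfer to a target language seen only in unlabeled data.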

Locations

  • arXiv (Cornell University)
  • Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Works Cited by This (45)

  • Overcoming catastrophic forgetting in neural networks (2017). James Kirkpatrick, Razvan Pascanu, Neil C. Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A. Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska-Barwińska
  • Get To The Point: Summarization with Pointer-Generator Networks (2017). Abigail See, Peter J. Liu, Christopher D. Manning
  • XNLI: Evaluating Cross-lingual Sentence Representations (2018). Alexis Conneau, Ruty Rinott, Guillaume Lample, Adina Williams, Samuel Bowman, Holger Schwenk, Veselin Stoyanov
  • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (2018). Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
  • Cross-lingual Language Model Pretraining (2019). Guillaume Lample, Alexis Conneau
  • Universal Language Model Fine-tuning for Text Classification (2018). Jeremy Howard, Sebastian Ruder
  • SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing (2018). Taku Kudo, John Richardson
  • Parameter-Efficient Transfer Learning for NLP (2019). Neil Houlsby, Andrei Giurgiu, Stanisław Jastrzębski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, Sylvain Gelly
  • PAWS-X: A Cross-lingual Adversarial Dataset for Paraphrase Identification (2019). Yinfei Yang, Yuan Zhang, Chris Tar, Jason Baldridge
  • Cross-Lingual Natural Language Generation via Pre-Training (2020). Zewen Chi, Li Dong, Furu Wei, Wenhui Wang, Xian-Ling Mao, Heyan Huang