FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding

Type: Article

Publication Date: 2021-05-18

Citations: 55

DOI: https://doi.org/10.1609/aaai.v35i14.17512

Abstract

Large-scale cross-lingual language models (LM), such as mBERT, Unicoder and XLM, have achieved great success in cross-lingual representation learning. However, when applied to zero-shot cross-lingual transfer tasks, most existing methods use only single-language input for LM finetuning, without leveraging the intrinsic cross-lingual alignment between different languages that proves essential for multilingual tasks. In this paper, we propose FILTER, an enhanced fusion method that takes cross-lingual data as input for XLM finetuning. Specifically, FILTER first encodes text input in the source language and its translation in the target language independently in the shallow layers, then performs cross-language fusion to extract multilingual knowledge in the intermediate layers, and finally performs further language-specific encoding. During inference, the model makes predictions based on the text input in the target language and its translation in the source language. For simple tasks such as classification, translated text in the target language shares the same label as the source language. However, this shared label becomes less accurate or even unavailable for more complex tasks such as question answering, NER and POS tagging. To tackle this issue, we further propose an additional KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language. Extensive experiments demonstrate that FILTER achieves new state-of-the-art performance on two challenging multilingual multi-task benchmarks, XTREME and XGLUE.
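To make the three-stage fusion pipeline and the KL-divergence self-teaching loss described above concrete, the following is a minimal PyTorch sketch for a sentence-classification setting. It is a sketch under stated assumptions, not the paper's implementation: the backbone is a generic Transformer encoder rather than the pretrained XLM-R model, the layer split (k_shallow, k_fusion, k_task), hidden size, and classification head are illustrative, and the detached source-side distribution stands in for the auto-generated soft pseudo-labels, which the paper produces with a separate teacher.

import torch
import torch.nn as nn
import torch.nn.functional as F

class FilterSketch(nn.Module):
    """Illustrative FILTER-style encoder: independent shallow encoding,
    cross-lingual fusion in intermediate layers, language-specific encoding
    at the top. Layer counts and sizes are assumptions, not the paper's."""
    def __init__(self, vocab_size=250_002, d_model=768, n_heads=12,
                 k_shallow=4, k_fusion=4, k_task=4, n_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        def stack(n):
            return nn.ModuleList(
                nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
                for _ in range(n))
        self.shallow = stack(k_shallow)   # per-language encoding
        self.fusion = stack(k_fusion)     # joint cross-lingual encoding
        self.task = stack(k_task)         # language-specific encoding again
        self.classifier = nn.Linear(d_model, n_classes)

    @staticmethod
    def run(x, layers):
        for layer in layers:
            x = layer(x)
        return x

    def forward(self, src_ids, tgt_ids):
        # 1) Shallow layers: the source text and its target-language
        #    translation are encoded independently.
        h_src = self.run(self.embed(src_ids), self.shallow)
        h_tgt = self.run(self.embed(tgt_ids), self.shallow)
        # 2) Intermediate layers: concatenate along the sequence axis so
        #    self-attention can fuse information across the two languages.
        fused = self.run(torch.cat([h_src, h_tgt], dim=1), self.fusion)
        h_src = fused[:, :src_ids.size(1)]
        h_tgt = fused[:, src_ids.size(1):]
        # 3) Final layers: further language-specific encoding, then a shared
        #    classification head over the first token of each sequence.
        logits_src = self.classifier(self.run(h_src, self.task)[:, 0])
        logits_tgt = self.classifier(self.run(h_tgt, self.task)[:, 0])
        return logits_src, logits_tgt

def filter_loss(logits_src, logits_tgt, labels, temperature=1.0, alpha=0.5):
    # Cross-entropy on the source side, where gold labels exist, plus a
    # KL-divergence self-teaching term that pulls target-language predictions
    # toward soft pseudo-labels. NOTE: using the detached source-side
    # distribution as the pseudo-label is a simplification of the paper's
    # teacher-generated pseudo-labels.
    ce = F.cross_entropy(logits_src, labels)
    teacher = F.softmax(logits_src.detach() / temperature, dim=-1)
    student = F.log_softmax(logits_tgt / temperature, dim=-1)
    kl = F.kl_div(student, teacher, reduction="batchmean")
    return ce + alpha * kl

For orientation: given tokenized batches src_ids and tgt_ids of shape (batch, seq_len) and gold labels for the source side, logits_src, logits_tgt = model(src_ids, tgt_ids) followed by filter_loss(logits_src, logits_tgt, labels) yields one training loss; at inference the roles are swapped, pairing target-language text with its source-language translation, as the abstract describes. For structured tasks such as NER, POS tagging, or question answering, the head and the pseudo-labels would be token- or span-level rather than sentence-level.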

Locations

  • Proceedings of the AAAI Conference on Artificial Intelligence
  • arXiv (Cornell University)

Similar Works

  • FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding (2020). Yuwei Fang, Shuohang Wang, Zhe Gan, Siqi Sun, Jingjing Liu
  • VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation (2020). Fuli Luo, Wei Wang, Jiahao Liu, Yijia Liu, Bin Bi, Songfang Huang, Fei Huang, Luo Si
  • On Learning Universal Representations Across Languages (2020). Xiangpeng Wei, Yue Hu, Rongxiang Weng, Luxi Xing, Heng Yu, Weihua Luo
  • InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training (2020). Zewen Chi, Li Dong, Furu Wei, Nan Yang, Saksham Singhal, Wenhui Wang, Xia Song, Xian-Ling Mao, Heyan Huang, Ming Zhou
  • Cross-Lingual Language Model Meta-Pretraining (2021). Zewen Chi, Heyan Huang, Luyang Liu, Yu Bai, Xian-Ling Mao
  • VECO: Variable Encoder-decoder Pre-training for Cross-lingual Understanding and Generation (2021). Fuli Luo, Wei Wang, Jiahao Liu, Yijia Liu, Bin Bi, Songfang Huang, Fei Huang, Luo Si
  • XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization (2020). Junjie Hu, Sebastian Ruder, Aditya Siddhant, Graham Neubig, Orhan Fırat, Melvin Johnson
  • Self-Augmentation Improves Zero-Shot Cross-Lingual Transfer (2023). Fei Wang, Kuan-Hao Huang, Kai-Wei Chang, Muhao Chen
  • Machine-Created Universal Language for Cross-Lingual Transfer (2024). Yaobo Liang, Quanzhi Zhu, Junhe Zhao, Nan Duan
  • Unicoder: A Universal Language Encoder by Pre-training with Multiple Cross-lingual Tasks (2019). Haoyang Huang, Yaobo Liang, Nan Duan, Ming Gong, Linjun Shou, Daxin Jiang, Ming Zhou
  • Enhancing Cross-lingual Transfer by Manifold Mixup (2022). Huiyun Yang, Huadong Chen, Hao Zhou, Lei Li
  • How Do Multilingual Encoders Learn Cross-lingual Representation? (2022). Shijie Wu
  • TransliCo: A Contrastive Learning Framework to Address the Script Barrier in Multilingual Pretrained Language Models (2024). Yihong Liu, Chunlan Ma, Haotian Ye, Hinrich Schütze
  • Prompt-Tuning Can Be Much Better Than Fine-Tuning on Cross-lingual Understanding With Multilingual Language Models (2022). Lifu Tu, Caiming Xiong, Yingbo Zhou
  • Multi-Level Contrastive Learning for Cross-Lingual Alignment (2022). Beiduo Chen, Wu Guo, Bin Gu, Quan Liu, Yongchao Wang
  • Multi-Source Cross-Lingual Model Transfer: Learning What to Share (2019). Xilun Chen, Ahmed Hassan Awadallah, Hany Hassan, Wei Wang, Claire Cardie
  • Cross-lingual Transfer of Monolingual Models (2021). Evangelia Gogoulou, Ariel Ekgren, Tim Isbister, Magnus Sahlgren

Works That Cite This (33)

  • MuRIL: Multilingual Representations for Indian Languages (2021). Simran Khanuja, Diksha Bansal, Sarvesh Mehtani, Savya Khosla, Atreyee Dey, Balaji Gopalan, Dilip Kumar Margam, Pooja Aggarwal, Rajiv Teja Nagipogu, Shachi Dave
  • English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too (2020). Jason Phang, Iacer Calixto, Phu Mon Htut, Yada Pruksachatkun, Haokun Liu, Clara Vania, Katharina Kann, Samuel R. Bowman
  • Bootstrapping Multilingual Semantic Parsers using Large Language Models (2023). Abhijeet Awasthi, Nitish Gupta, Bidisha Samanta, Shachi Dave, Sunita Sarawagi, Partha Talukdar
  • Revisiting Machine Translation for Cross-lingual Classification (2023). Mikel Artetxe, Vedanuj Goswami, Shruti Bhosale, Angela Fan, Luke Zettlemoyer
  • Synergy with Translation Artifacts for Training and Inference in Multilingual Tasks (2022). Jaehoon Oh, Jongwoo Ko, Se-Young Yun
  • mT5: A massively multilingual pre-trained text-to-text transformer (2020). Linting Xue, Noah Constant, Adam P. Roberts, Mihir Kale, Rami Al‐Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel
  • SILT: Efficient transformer training for inter-lingual inference (2022). Javier Huertas‐Tato, Alejandro Martín, David Camacho
  • Consistency Regularization for Cross-Lingual Fine-Tuning (2021). Bo Zheng, Li Dong, Shaohan Huang, Wenhui Wang, Zewen Chi, Saksham Singhal, Wanxiang Che, Ting Liu, Xia Song, Furu Wei
  • nmT5 -- Is parallel data still relevant for pre-training massively multilingual language models? (2021). Mihir Kale, Aditya Siddhant, Noah Constant, Melvin Johnson, Rami Al‐Rfou, Linting Xue
  • DualNER: A Dual-Teaching framework for Zero-shot Cross-lingual Named Entity Recognition (2022). Jiali Zeng, Yufan Jiang, Yongjing Yin, Xu Wang, Binghuai Lin, Yunbo Cao