Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training

Type: Preprint

Publication Date: 2021-01-01

Citations: 0

DOI: https://doi.org/10.48550/arxiv.2109.07306

Locations

  • arXiv (Cornell University)
  • DataCite API

Similar Works

  • Allocating Large Vocabulary Capacity for Cross-Lingual Language Model Pre-Training (2021). Bo Zheng, Li Dong, Shaohan Huang, Saksham Singhal, Wanxiang Che, Ting Liu, Song Xia, Furu Wei
  • MultiFiT: Efficient Multi-lingual Language Model Fine-tuning (2019). Julian Martin Eisenschlos, Sebastian Ruder, Piotr Czapla, Marcin Kardas, Sylvain Gugger, Jeremy Howard
  • UniBridge: A Unified Approach to Cross-Lingual Transfer Learning for Low-Resource Languages (2024). Trinh Pham, Khoi M. Le, Luu Anh Tuan
  • DEPT: Decoupled Embeddings for Pre-training Language Models (2024). Alex Iacob, Lorenzo Sani, Meghdad Kurmanji, William F. Shen, Xinchi Qiu, Dongqi Cai, Yan Gao, Nicholas D. Lane
  • Optimizing Low-Resource Language Model Training: Comprehensive Analysis of Multi-Epoch, Multi-Lingual, and Two-Stage Approaches (2024). Kosuke Akimoto, Masafumi Oyamada
  • Mitigating Language-Level Performance Disparity in mPLMs via Teacher Language Selection and Cross-lingual Self-Distillation (2024). Haozhe Zhao, Zefan Cai, Shuzheng Si, Liang Chen, Yufeng He, Kaikai An, Baobao Chang
  • An Efficient Multilingual Language Model Compression through Vocabulary Trimming (2023). Asahi Ushio, Yi Zhou, José Camacho-Collados
  • Exploring Design Choices for Building Language-Specific LLMs (2024). Atula Tejaswi, Nilesh Gupta, Eunsol Choi
  • Efficient Multilingual Language Model Compression through Vocabulary Trimming (2023). Asahi Ushio, Yi Zhou, José Camacho-Collados
  • Improving Multilingual Models with Language-Clustered Vocabularies (2020). Hyung Won Chung, Dan Garrette, Kiat Chuan Tan, Jason Riesa
  • A Survey on Large Language Models with Multilingualism: Recent Advances and New Frontiers (2024). Kaiyu Huang, Fengran Mo, Hongliang Li, You Li, Yuanchi Zhang, Wei-Jian Yi, Yulong Mao, Jinchen Liu, Yuzhuang Xu, Jinan Xu
  • MaLA-500: Massive Language Adaptation of Large Language Models (2024). Peiqin Lin, Shaoxiong Ji, Jörg Tiedemann, André F. T. Martins, Hinrich Schütze
  • An Empirical Comparison of Vocabulary Expansion and Initialization Approaches for Language Models (2024). Nandini Mundra, Aditya Nanda Kishore, Raj Dabre, Ratish Puduppully, Anoop Kunchukuttan, Mitesh M. Khapra
  • XTransplant: A Probe into the Upper Bound Performance of Multilingual Capability and Culture Adaptability in LLMs via Mutual Cross-lingual Feed-forward Transplantation (2024). Yangfan Ye, Xiaocheng Feng, Xiachong Feng, Lirong Qin, Yudong Huang, Lei Huang, Weitao Ma, Z. Zhang, Yunfei Lu, Xiaohui Yan
  • Analyzing and Reducing the Performance Gap in Cross-Lingual Transfer with Fine-tuning Slow and Fast (2023). Yiduo Guo, Yaobo Liang, Dongyan Zhao, Bing Liu, Nan Duan
  • Strategies for Training Large Vocabulary Neural Language Models (2015). Welin Chen, David Grangier, Michael Auli

Works That Cite This (0)
