Contrastive Augmented Graph2Graph Memory Interaction for Few Shot Continual Learning (2025). Biqing Qi, Junqi Gao, Xinquan Chen, Li Dong, Jianxing Liu, Ligang Wu, Bowen Zhou.

Multimodal Latent Language Modeling with Next-Token Diffusion (2024). Yutao Sun, Hangbo Bao, Wenhui Wang, Zhiliang Peng, Li Dong, Shaohan Huang, Jianyong Wang, Furu Wei.

Differential Transformer (2024). Tianzhu Ye, Li Dong, Yuqing Xia, Yutao Sun, Yi Zhu, Gao Huang, Furu Wei.

FinCon: A Synthesized LLM Multi-Agent System with Conceptual Verbal Reinforcement for Enhanced Financial Decision Making (2024). Yangyang Yu, Zhiyuan Yao, Haohang Li, Zhiyang Deng, Yupeng Cao, Zhi Chen, Jordan W. Suchow, Rong Liu, Zhenyu Cui, Denghui Zhang.
PG-LBO: Enhancing High-Dimensional Bayesian Optimization with Pseudo-Label and Gaussian Process Guidance (2024). Taicai Chen, Yue Duan, Li Dong, Qi Lei, Yinghuan Shi, Yang Gao.

Knowledge Distillation of Large Language Models (2023). Yuxian Gu, Li Dong, Furu Wei, Minlie Huang.

GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator (2023). Jian Yang, Shuming Ma, Li Dong, Shaohan Huang, Haoyang Huang, Yuwei Yin, Dongdong Zhang, Liqun Yang, Furu Wei, Zhoujun Li.

Beyond English-Centric Bitexts for Better Multilingual Language Representation Learning (2023). Barun Patra, Saksham Singhal, Shaohan Huang, Zewen Chi, Li Dong, Furu Wei, Vishrav Chaudhary, Song Xia.

A Length-Extrapolatable Transformer (2023). Yutao Sun, Li Dong, Barun Patra, Shuming Ma, Shaohan Huang, Alon Benhaim, Vishrav Chaudhary, Song Xia, Furu Wei.
Transforming Wikipedia Into Augmented Data for Query-Focused Summarization (2022). Haichao Zhu, Li Dong, Furu Wei, Bing Qin, Ting Liu.

Knowledge Neurons in Pretrained Transformers (2022). Damai Dai, Li Dong, Yaru Hao, Zhifang Sui, Baobao Chang, Furu Wei.

XLM-E: Cross-lingual Language Model Pre-training via ELECTRA (2022). Zewen Chi, Shaohan Huang, Li Dong, Shuming Ma, Bo Zheng, Saksham Singhal, Payal Bajaj, Song Xia, Xian-Ling Mao, Heyan Huang.

Self-Attention Attribution: Interpreting Information Interactions Inside Transformer (2021). Yaru Hao, Li Dong, Furu Wei, Ke Xu.

Towards Effective Context for Meta-Reinforcement Learning: an Approach based on Contrastive Learning (2021). Haotian Fu, Hongyao Tang, Jianye Hao, Chen Chen, Xidong Feng, Li Dong, Wulong Liu.
Zero-Shot Cross-Lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders (2021). Guanhua Chen, Shuming Ma, Yun Chen, Li Dong, Dongdong Zhang, Jia Pan, Wenping Wang, Furu Wei.

mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs (2021). Zewen Chi, Li Dong, Shuming Ma, Shaohan Huang, Saksham Singhal, Xian-Ling Mao, Heyan Huang, Song Xia, Furu Wei.

Consistency Regularization for Cross-Lingual Fine-Tuning (2021). Bo Zheng, Li Dong, Shaohan Huang, Wenhui Wang, Zewen Chi, Saksham Singhal, Wanxiang Che, Ting Liu, Song Xia, Furu Wei.

InfoBehavior: Self-supervised Representation Learning for Ultra-long Behavior Sequence via Hierarchical Grouping (2021). Runshi Liu, Pengda Qin, Yuhong Li, Weigao Wen, Li Dong, Kefeng Deng, Qiang Wu.

Improving Pretrained Cross-Lingual Language Models via Self-Labeled Word Alignment (2021). Zewen Chi, Li Dong, Bo Zheng, Shaohan Huang, Xian-Ling Mao, Heyan Huang, Furu Wei.

InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training (2021). Zewen Chi, Li Dong, Furu Wei, Nan Yang, Saksham Singhal, Wenhui Wang, Song Xia, Xian-Ling Mao, Heyan Huang, Ming Zhou.

MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers (2021). Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei.

Learning to Sample Replacements for ELECTRA Pre-Training (2021). Yaru Hao, Li Dong, Hangbo Bao, Ke Xu, Furu Wei.
DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders (2021). Shuming Ma, Li Dong, Shaohan Huang, Dongdong Zhang, Alexandre Muzio, Saksham Singhal, Hany Hassan Awadalla, Song Xia, Furu Wei.
Adapt-and-Distill: Developing Small, Fast and Effective Pretrained Language Models for Domains (2021). Yunzhi Yao, Shaohan Huang, Wenhui Wang, Li Dong, Furu Wei.
A Semi-supervised Multi-task Learning Approach to Classify Customer Contact Intents (2021). Li Dong, Matthew C. Spencer, Amir Biagi.

Allocating Large Vocabulary Capacity for Cross-Lingual Language Model Pre-Training (2021). Bo Zheng, Li Dong, Shaohan Huang, Saksham Singhal, Wanxiang Che, Ting Liu, Song Xia, Furu Wei.
s2s-ft: Fine-Tuning Pretrained Transformer Encoders for Sequence-to-Sequence Learning (2021). Hangbo Bao, Li Dong, Wenhui Wang, Nan Yang, Furu Wei.

Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task (2021). Jian Yang, Shuming Ma, Haoyang Huang, Dongdong Zhang, Li Dong, Shaohan Huang, Alexandre Muzio, Saksham Singhal, Hany Hassan Awadalla, Song Xia.
UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training (2020). Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Jianfeng Gao, Songhao Piao, Ming Zhou.
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers (2020). Wenhui Wang, Furu Wei, Li Dong, Hangbo Bao, Nan Yang, Ming Zhou.
Harvesting and Refining Question-Answer Pairs for Unsupervised QA (2020). Zhongli Li, Wenhui Wang, Li Dong, Furu Wei, Ke Xu.
XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders (2020). Shuming Ma, Jian Yang, Haoyang Huang, Zewen Chi, Li Dong, Dongdong Zhang, Hany Hassan Awadalla, Alexandre Muzio, Akiko Eriguchi, Saksham Singhal.

Unified Language Model Pre-training for Natural Language Understanding and Generation (2019). Li Dong, Nan Yang, Wenhui Wang, Furu Wei, Xiaodong Liu, Yu Wang, Jianfeng Gao, Ming Zhou, Hsiao-Wuen Hon.

Data-to-text Generation with Entity Modeling (2019). Ratish Puduppully, Li Dong, Mirella Lapata.

Learning to Ask Unanswerable Questions for Machine Reading Comprehension (2019). Haichao Zhu, Li Dong, Furu Wei, Wenhui Wang, Bing Qin, Ting Liu.

Visualizing and Understanding the Effectiveness of BERT (2019). Yaru Hao, Li Dong, Furu Wei, Ke Xu.
Learning a Unified Named Entity Tagger from Multiple Partially Annotated Corpora for Efficient Adaptation (2019). Xiao Huang, Li Dong, Elizabeth Boschee, Nanyun Peng.

Can Monolingual Pretrained Models Help Cross-Lingual Classification? (2019). Zewen Chi, Li Dong, Furu Wei, Xian-Ling Mao, Heyan Huang.
Coarse-to-Fine Decoding for Neural Semantic Parsing (2018). Li Dong, Mirella Lapata.

Confidence Modeling for Neural Semantic Parsing (2018). Li Dong, Chris Quirk, Mirella Lapata.

Learning Structured Semantic Embeddings for Visual Recognition (2017). Li Dong, Hsin-Ying Lee, Jia-Bin Huang, Shengjin Wang, Ming-Hsuan Yang.

Learning to Paraphrase for Question Answering (2017). Li Dong, Jonathan Mallinson, Siva Reddy, Mirella Lapata.

Long Short-Term Memory-Networks for Machine Reading (2016). Jianpeng Cheng, Li Dong, Mirella Lapata.

Language to Logical Form with Neural Attention (2016). Li Dong, Mirella Lapata.

A Statistical Parsing Framework for Sentiment Classification (2014). Li Dong, Furu Wei, Shujie Liu, Ming Zhou, Ke Xu.