Generative Pre-trained Transformer: A Comprehensive Review on Enabling Technologies, Potential Applications, Emerging Challenges, and Future Directions

Type: Review

Publication Date: 2023-01-01

Citations: 36

DOI: https://doi.org/10.48550/arxiv.2305.10435

Locations

  • arXiv (Cornell University)
  • DataCite API

Similar Works

  • Examining User-Friendly and Open-Sourced Large GPT Models: A Survey on Language, Multimodal, and Scientific GPT Models (2023). Kaiyuan Gao, Sunan He, Zhenyu He, Jiacheng Lin, Qizhi Pei, Jie Shao, Wei Zhang
  • AMMUS: A Survey of Transformer-based Pretrained Models in Natural Language Processing (2021). Katikapalli Subramanyam Kalyan, Ajit Rajasekharan, S. Sangeetha
  • FPM: A Collection of Large-scale Foundation Pre-trained Language Models (2021). Dezhou Shen
  • HuggingFace's Transformers: State-of-the-art Natural Language Processing (2019). Thomas Wolf, Lysandre Debut, Victor Sanh, Julien Chaumond, Clément Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz
  • All NLP Tasks Are Generation Tasks: A General Pretraining Framework (2021). Zhengxiao Du, Yujie Qian, Xiao Liu, Ming Ding, Jiezhong Qiu, Zhilin Yang, Jie Tang
  • A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT (2023). Ce Zhou, Qian Li, Chen Li, Jun Yu, Yixin Liu, Guangjing Wang, Kai Zhang, Cheng Ji, Qiben Yan, Lifang He
  • Efficient Long-Range Transformers: You Need to Attend More, but Not Necessarily at Every Layer (2023). Qingru Zhang, Dhananjay Ram, Cole Hawkins, Sheng Zha, Tuo Zhao
  • CTRAN: CNN-Transformer-based network for natural language understanding (2023). Mehrdad Rafiepour, Javad Salimi Sartakhti
  • Can Bidirectional Encoder Become the Ultimate Winner for Downstream Applications of Foundation Models? (2024). L. Yang, Xuanyu Zhou, Jianqing Fan, Xinyi Xie, Shengxin Zhu
  • Engineering A Large Language Model From Scratch (2024). Abiodun Finbarrs Oketunji
  • A Survey on Large Language Models from Concept to Implementation (2024). Chen Wang, Zhao Jin, Jiaqi Gong
  • Finding Skill Neurons in Pre-trained Transformer-based Language Models (2022). Xiaozhi Wang, Kaiyue Wen, Zhengyan Zhang, Lei Hou, Zhiyuan Liu, Juanzi Li
  • Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model (2022). Shaden Smith, Mostofa Patwary, Brandon Norick, Patrick LeGresley, Samyam Rajbhandari, Jared Casper, Zhun Liu, Shrimai Prabhumoye, George Zerveas, Vijay Anand Korthikanti
  • ETC: Encoding Long and Structured Inputs in Transformers (2020). Joshua Ainslie, Santiago Ontañón, Chris Alberti, Vaclav Cvicek, Zachary Fisher, Philip Pham, Anirudh Ravula, Sumit Sanghai, Qifan Wang, Yang Li
  • Development of Pre-Trained Transformer-based Models for the Nepali Language (2024). P Thapa, Jinu Nyachhyon, Mridul Sharma, Bal Krishna Bal

Works Cited by This (0)
