XPipe: Efficient Pipeline Model Parallelism for Multi-GPU DNN Training

Type: Preprint

Publication Date: 2019

Citations: 33

DOI: https://doi.org/10.48550/arxiv.1911.04610

Locations

  • arXiv (Cornell University)
  • DataCite API

Similar Works

  • HetPipe: Enabling Large DNN Training on (Whimpy) Heterogeneous GPU Clusters through Integration of Pipelined Model Parallelism and Data Parallelism (2020). Jay Park, Gyeongchan Yun, Chang M. Yi, Nguyen T. Nguyen, Seungmin Lee, Jaesik Choi, Sam H. Noh, Young-ri Choi
  • Efficient and Robust Parallel DNN Training through Model Parallelism on Multi-GPU Platform (2018). Chi‐Chung Chen, Chia-Lin Yang, Hsiang-Yun Cheng
  • BitPipe: Bidirectional Interleaved Pipeline Parallelism for Accelerating Large Models Training (2024). Houming Wu, Ling Chen, Wenjie Yu
  • DAPPLE: A Pipelined Data Parallel Approach for Training Large Models (2020). Shiqing Fan, Yi Rong, Meng Chen, Zongyan Cao, Siyu Wang, Zhen Zheng, Chuan Wu, Guoping Long, Jun Yang, Lixue Xia
  • Efficient Pipeline Planning for Expedited Distributed DNN Training (2022). Ziyue Luo, Xiaodong Yi, Guoping Long, Shiqing Fan, Chuan Wu, Jun Yang, Wei Lin
  • TiMePReSt: Time and Memory Efficient Pipeline Parallel DNN Training with Removed Staleness (2024). A. Dutta, Nabendu Chaki, Rajat K. De
  • BaPipe: Exploration of Balanced Pipeline Parallelism for DNN Training (2020). Letian Zhao, Rui Xu, Tianqi Wang, Teng Tian, Xiaotian Wang, Wei Wu, Chio-in Ieong, Xi Jin
  • PipeDream: Fast and Efficient Pipeline Parallel DNN Training (2018). Aaron Harlap, Deepak Narayanan, Amar Phanishayee, Vivek Seshadri, Nikhil R. Devanur, Gregory R. Ganger, Phillip B. Gibbons
  • GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism (2024). Byungsoo Jeon, Mengdi Wu, Shiyi Cao, Sunghyun Kim, Sunghyun Park, Neeraj Aggarwal, Colin Unger, Daiyaan Arfeen, Peiyuan Liao, Xupeng Miao
  • Merak: An Efficient Distributed DNN Training Framework with Automated 3D Parallelism for Giant Foundation Models (2022). Zhiquan Lai, Shengwei Li, Xudong Tang, Keshi Ge, Weijie Liu, Yabo Duan, Linbo Qiao, Dongsheng Li
  • Opara: Exploiting Operator Parallelism for Expediting DNN Inference on GPUs (2023). Aodong Chen, Fei Xu, Han Li, Yuan Dong, Li Chen, Zhi Zhou, Fangming Liu
  • Faster Multi-GPU Training with PPLL: A Pipeline Parallelism Framework Leveraging Local Learning (2024). Xiuyuan Guo, Cheng Xu, Guangyu Guo, Feiyu Zhu, Chunfang Cai, Peizhe Wang, Xiaoming Wei, Junhao Su, Jialin Gao
  • PipeMare: Asynchronous Pipeline Parallel DNN Training (2019). Bowen Yang, Jian Zhang, Jonathan Li, Christopher Ré, Christopher R. Aberger, Christopher De Sa
  • Whale: A Unified Distributed Training Framework (2020). Ang Wang, Xianyan Jia, Le Jiang, Jie Zhang, Yong Li, Wei Lin
  • Beyond Data and Model Parallelism for Deep Neural Networks (2018). Zhihao Jia, Matei Zaharia, Alex Aiken
  • Workload-aware Automatic Parallelization for Multi-GPU DNN Training (2018). Sungho Shin, Youngmin Jo, Jungwook Choi, Swagath Venkataramani, Vijayalakshmi Srinivasan, Wonyong Sung

Works That Cite This (19)

  • DISCO: Distributed Inference with Sparse Communications (2024). Minghai Qin, Chao Sun, Jaco Hofmann, Dejan Vučinić
  • DistrEdge: Speeding up Convolutional Neural Network Inference on Distributed Edge Devices (2022). Xueyu Hou, Yongjie Guan, Tao Han, Ning Zhang
  • Training Recommender Systems at Scale: Communication-Efficient Model and Data Parallelism (2020). Vipul Gupta, Dhruv Choudhary, Ping Tang, Xiaohan Wei, Xing Wang, Yuzhen Huang, Arun Kejariwal, Kannan Ramchandran, Michael W. Mahoney
  • Fast Distributed Training of Deep Neural Networks: Dynamic Communication Thresholding for Model and Data Parallelism (2020). Vipul Gupta, Dhruv Choudhary, Ping Tang, Xiaohan Wei, Xing Wang, Yuzhen Huang, Arun Kejariwal, Kannan Ramchandran, Michael W. Mahoney
  • Pipelined Training with Stale Weights in Deep Convolutional Neural Networks (2021). Lifu Zhang, Tarek S. Abdelrahman
  • Deep Optimizer States: Towards Scalable Training of Transformer Models using Interleaved Offloading (2024). Avinash Maurya, Jie Ye, M. Mustafa Rafique, Franck Cappello, Bogdan Nicolae
  • Activations and Gradients Compression for Model-Parallel Training (2023). Mikhail Rudakov, Aleksandr Beznosikov, Ya. A. Kholodov, Alexander Gasnikov
  • FTPipeHD: A Fault-Tolerant Pipeline-Parallel Distributed Training Framework for Heterogeneous Edge Devices (2021). Yuhao Chen, Qianqian Yang, Shibo He, Zhiguo Shi, Jiming Chen
  • DAPPLE: A Pipelined Data Parallel Approach for Training Large Models (2020). Shiqing Fan, Yi Rong, Meng Chen, Zongyan Cao, Siyu Wang, Zhen Zheng, Chuan Wu, Guoping Long, Jun Yang, Lixue Xia
  • Automatic Graph Partitioning for Very Large-scale Deep Learning (2021). Masahiro Tanaka, Kenjiro Taura, Toshihiro Hanawa, Kentaro Torisawa

Works Cited by This (18)

  • Very Deep Convolutional Networks for Large-Scale Image Recognition (2014). Karen Simonyan, Andrew Zisserman
  • Distilling the Knowledge in a Neural Network (2015). Geoffrey E. Hinton, Oriol Vinyals, Jeff Dean
  • A Convolutional Neural Network for Modelling Sentences (2014). Nal Kalchbrenner, Edward Grefenstette, Phil Blunsom
  • Rethinking the Inception Architecture for Computer Vision (2016). Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, Zbigniew Wojna
  • Deep Residual Learning for Image Recognition (2016). Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
  • Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation (2016). Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey
  • Device Placement Optimization with Reinforcement Learning (2017). Azalia Mirhoseini, Hieu Pham, Quoc V. Le, Benoit Steiner, Rasmus Larsen, Yuefeng Zhou, Naveen Kumar, Mohammad Norouzi, Samy Bengio, Jeff Dean
  • AMPNet: Asynchronous Model-Parallel Training for Dynamic Neural Networks (2017). Alexander L. Gaunt, Matthew Johnson, Maik Riechert, Daniel Tarlow, Ryota Tomioka, Dimitrios Vytiniotis, Sam Webster
  • Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis (2018). Tal Ben‐Nun, Torsten Hoefler
  • PipeDream: Fast and Efficient Pipeline Parallel DNN Training (2018). Aaron Harlap, Deepak Narayanan, Amar Phanishayee, Vivek Seshadri, Nikhil R. Devanur, Gregory R. Ganger, Phillip B. Gibbons