BaPipe: Exploration of Balanced Pipeline Parallelism for DNN Training

Type: Preprint

Publication Date: 2020-01-01

Citations: 5

DOI: https://doi.org/10.48550/arxiv.2012.12544

Locations

  • arXiv (Cornell University)
  • DataCite API

Similar Works

  • HyPar: Towards Hybrid Parallelism for Deep Learning Accelerator Array (2019). Linghao Song, Jiachen Mao, Youwei Zhuo, Xuehai Qian, Hai Li, Yiran Chen
  • Merak: An Efficient Distributed DNN Training Framework with Automated 3D Parallelism for Giant Foundation Models (2022). Zhiquan Lai, Shengwei Li, Xudong Tang, Keshi Ge, Weijie Liu, Yabo Duan, Linbo Qiao, Dongsheng Li
  • DAPPLE: A Pipelined Data Parallel Approach for Training Large Models (2020). Shiqing Fan, Yi Rong, Meng Chen, Zongyan Cao, Siyu Wang, Zhen Zheng, Chuan Wu, Guoping Long, Jun Yang, Lixue Xia
  • BitPipe: Bidirectional Interleaved Pipeline Parallelism for Accelerating Large Models Training (2024). Houming Wu, Ling Chen, Wenjie Yu
  • Beyond Data and Model Parallelism for Deep Neural Networks (2018). Zhihao Jia, Matei Zaharia, Alex Aiken
  • PipeDream: Fast and Efficient Pipeline Parallel DNN Training (2018). Aaron Harlap, Deepak Narayanan, Amar Phanishayee, Vivek Seshadri, Nikhil R. Devanur, Gregory R. Ganger, Phillip B. Gibbons
  • Efficient and Robust Parallel DNN Training through Model Parallelism on Multi-GPU Platform (2018). Chi-Chung Chen, Chia-Lin Yang, Hsiang-Yun Cheng
  • XPipe: Efficient Pipeline Model Parallelism for Multi-GPU DNN Training (2019). Lei Guan, Wotao Yin, Dongsheng Li, Xicheng Lu
  • TiMePReSt: Time and Memory Efficient Pipeline Parallel DNN Training with Removed Staleness (2024). A. Dutta, Nabendu Chaki, Rajat K. De
  • HetPipe: Enabling Large DNN Training on (Whimpy) Heterogeneous GPU Clusters through Integration of Pipelined Model Parallelism and Data Parallelism (2020). Jay Park, Gyeongchan Yun, Chang M. Yi, Nguyen T. Nguyen, Seungmin Lee, Jaesik Choi, Sam H. Noh, Young-ri Choi
  • An Oracle for Guiding Large-Scale Model/Hybrid Parallel Training of Convolutional Neural Networks (2021). Albert Njoroge Kahira, Truong Thao Nguyen, Leonardo Bautista-Gomez, Ryousei Takano, Rosa M. Badía, Mohamed Wahib
  • GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism (2024). Byungsoo Jeon, Mengdi Wu, Shiyi Cao, Sunghyun Kim, Sunghyun Park, Neeraj Aggarwal, Colin Unger, Daiyaan Arfeen, Peiyuan Liao, Xupeng Miao
  • FPDeep: Scalable Acceleration of CNN Training on Deeply-Pipelined FPGA Clusters (2020). Tianqi Wang, Tong Geng, Ang Li, Xi Jin, Martin C. Herbordt
  • Pase: Parallelization Strategies for Efficient DNN Training (2021). Venmugil Elango
  • Efficient Pipeline Planning for Expedited Distributed DNN Training (2022). Ziyue Luo, Xiaodong Yi, Guoping Long, Shiqing Fan, Chuan Wu, Jun Yang, Wei Lin
  • CNNLab: a Novel Parallel Framework for Neural Networks using GPU and FPGA - a Practical Study with Trade-off Analysis (2016). Maohua Zhu, Liu Liu, Chao Wang, Yuan Xie

Works Cited by This (18)

  • Very Deep Convolutional Networks for Large-Scale Image Recognition (2014). Karen Simonyan, Andrew Zisserman
  • Deep Residual Learning for Image Recognition (2016). Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
  • Identity Mappings in Deep Residual Networks (2016). Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
  • Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation (2016). Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey
  • Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour (2017). Priya Goyal, Piotr Dollár, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, Kaiming He
  • Exploring Hidden Dimensions in Parallelizing Convolutional Neural Networks (2018). Zhihao Jia, Sina Lin, Charles R. Qi, Alex Aiken
  • Attention Is All You Need (2017). Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin
  • Dynamic Scheduling of MPI-based Distributed Deep Learning Training Jobs (2019). Tim Capes, Vishal Raheja, Mete Kemertas, Iqbal Mohomed
  • Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training (2019). Saptadeep Pal, Eiman Ebrahimi, Arslan Zulfiqar, Yaosheng Fu, Victor Zhang, Szymon Migacz, David Nellans, Puneet Gupta
  • PyTorch: An Imperative Style, High-Performance Deep Learning Library (2019). Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga