Scaling GRPC Tensorflow on 512 nodes of Cori Supercomputer
Amrita Mathuriya, Thorsten Kurth, Vivek Rane, Mustafa Mustafa, Lei Shao, Debbie Bard, Prabhat, Victor W. Lee
Type: Preprint
Publication Date: 2017-12-26
Citations: 10
Locations: arXiv (Cornell University)
Similar Works
Title
Year
Authors
Throughput Prediction of Asynchronous SGD in TensorFlow
2020
Zhuojin Li
Wumo Yan
Marco Paolieri
Leana Golubchik
Scalable Distributed DNN Training using TensorFlow and CUDA-Aware MPI: Characterization, Designs, and Performance Evaluation
2019
Ammar Ahmad Awan
Jeroen Bedorf
Ching-Hsiang Chu
Hari Subramoni
Dhabaleswar K. Panda
FireCaffe: Near-Linear Acceleration of Deep Neural Network Training on Compute Clusters
2016
Forrest Iandola
Matthew W. Moskewicz
Khalid Ashraf
Kurt Keutzer
User-transparent Distributed TensorFlow
2017
Abhinav Vishnu
Joseph Manzano
Charles Siegel
Jeff Daily
FireCaffe: near-linear acceleration of deep neural network training on compute clusters
2015
Forrest Iandola
Khalid Ashraf
Matthew W. Moskewicz
Kurt Keutzer
PyTorch Distributed: Experiences on Accelerating Data Parallel Training
2020
Li Shen
Yanli Zhao
Rohan Varma
Omkar Salpekar
Pieter Noordhuis
Teng Li
Adam Paszke
Jeff Smith
Brian Vaughan
Pritam Damania
On Scale-out Deep Learning Training for Cloud and HPC
2018
Srinivas Sridharan
Karthikeyan Vaidyanathan
Dhiraj Kalamkar
Dipankar Das
Mikhail E. Smorkalov
Mikhail Shiryaev
Dheevatsa Mudigere
Naveen Mellempudi
Sasikanth Avancha
Bharat Kaul
MXNET-MPI: Embedding MPI parallelism in Parameter Server Task Model for scaling Deep Learning.
2018
Amith R. Mamidala
Γεώργιος Κόλλιας
Chris Ward
Fausto Artico
HyPar-Flow: Exploiting MPI and Keras for Scalable Hybrid-Parallel DNN Training using TensorFlow
2019
Ammar Ahmad Awan
Arpan Jain
Quentin Anthony
Hari Subramoni
Dhabaleswar K. Panda
swCaffe: A Parallel Framework for Accelerating Deep Learning Applications on Sunway TaihuLight
2018
Liandeng Li
Jiarui Fang
Haohuan Fu
Jinlei Jiang
Wenlai Zhao
Conghui He
Xin You
Guangwen Yang
Image Classification at Supercomputer Scale
2018
Chris Ying
Sameer Kumar
Dehao Chen
Tao Wang
Youlong Cheng
TensorFlow Doing HPC
2019
Steven W. D. Chien
Stefano Markidis
Vyacheslav Olshevsky
Yaroslav Bulatov
Erwin Laure
Jeffrey S. Vetter
swCaffe: a Parallel Framework for Accelerating Deep Learning Applications on Sunway TaihuLight
2019
Jiarui Fang
Liandeng Li
Haohuan Fu
Jinlei Jiang
Wenlai Zhao
Conghui He
Xin You
Guangwen Yang
PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel
2023
Yanli Zhao
Andrew Gu
Rohan Varma
Liang Luo
Chien-Chin Huang
Min Xu
Less Wright
Hamid Shojanazeri
Myle Ott
Sam Shleifer
Performance Modeling and Evaluation of Distributed Deep Learning Frameworks on GPUs
2018
Shaohuai Shi
Qiang Wang
Xiaowen Chu
Works That Cite This (7)
Title
Year
Authors
CosmoFlow: Using Deep Learning to Learn the Universe at Scale
2018
Amrita Mathuriya
Deborah Bard
P. J. Mendygral
Lawrence Meadows
James Arnemann
Lei Shao
Siyu He
Tuomas Kärnä
Diana Moise
S. J. Pennycook
Etalumis
2019
Atılım Güneş Baydin
Lei Shao
W. Bhimji
L. Heinrich
Lawrence Meadows
Jialin Liu
Andreas Munk
Saeid Naderiparizi
Bradley Gram-Hansen
Gilles Louppe
HPC AI500: A Benchmark Suite for HPC AI Systems
2019
Zihan Jiang
Wanling Gao
Lei Wang
Xingwang Xiong
Yuchen Zhang
Xu Wen
Chunjie Luo
Hainan Ye
Yunquan Zhang
Shengzhong Feng
MetH: A family of high-resolution and variable-shape image challenges
2019
Ferran Parés Pont
Dario García-Gasulla
Harald Servat
Jesús Labarta
Eduard Ayguadé
HPC AI500: Representative, Repeatable and Simple HPC AI Benchmarking
2021
Zihan Jiang
Wanling Gao
Fei Tang
Xingwang Xiong
Lei Wang
Chuanxin Lan
Chunjie Luo
Hongxiao Li
Jianfeng Zhan
HPC AI500: The Methodology, Tools, Roofline Performance Models, and Metrics for Benchmarking HPC AI Systems
2020
Zihan Jiang
Lei Wang
Xingwang Xiong
Wanling Gao
Chunjie Luo
Fei Tang
Chuanxin Lan
Hongxiao Li
Jianfeng Zhan
Works Cited by This (2)
Title
Year
Authors
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
2016
Martín Abadi
Ashish Agarwal
Paul Barham
Eugene Brevdo
Zhifeng Chen
Craig Citro
Gregory S. Corrado
Andy Davis
Jeffrey Dean
Matthieu Devin
Deep Residual Learning for Image Recognition
2015
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun