Yanghua Peng

All published works
Action Title Year Authors
+ HybridFlow: A Flexible and Efficient RLHF Framework 2024 Guangying Sheng
Chi Zhang
Ziliang Ye
Xiang-jun Wu
Wang Zhang
Ru Zhang
Yanghua Peng
Haibin Lin
Chuan Wu
+ Optimus: Accelerating Large-Scale Multi-Modal LLM Training by Bubble Exploitation 2024 Wei Feng
Yangrui Chen
Shaoyu Wang
Yanghua Peng
Haibin Lin
Mingchuan Yu
+ ByteCheckpoint: A Unified Checkpointing System for LLM Development 2024 Borui Wan
Mingji Han
Yiyao Sheng
Zhichao Lai
Mofan Zhang
Junda Zhang
Yanghua Peng
Haibin Lin
Xin Liu
Chuan Wu
+ QSync: Quantization-Minimized Synchronous Distributed Training Across Hybrid Devices 2024 Juntao Zhao
Borui Wan
Yanghua Peng
Haibin Lin
Yibo Zhu
Chuan Wu
+ CDMPP: A Device-Model Agnostic Framework for Latency Prediction of Tensor Programs 2024 Hanpeng Hu
Junwei Su
Juntao Zhao
Yanghua Peng
Yibo Zhu
Haibin Lin
Chuan Wu
+ LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization 2024 Juntao Zhao
Borui Wan
Yanghua Peng
Haibin Lin
Chuan Wu
+ MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs 2024 Ziheng Jiang
Haibin Lin
Yinmin Zhong
Qi Huang
Yangrui Chen
Zhi Zhang
Yanghua Peng
Xiang Li
Cong Xie
Shibiao Nong
+ CDMPP: A Device-Model Agnostic Framework for Latency Prediction of Tensor Programs 2023 Hanpeng Hu
Junwei Su
Juntao Zhao
Yanghua Peng
Yibo Zhu
Haibin Lin
Chuan Wu
+ dPRO: A Generic Profiling and Optimization System for Expediting Distributed DNN Training 2022 Hanpeng Hu
Chenyu Jiang
Yuchen Zhong
Yanghua Peng
Chuan Wu
Yibo Zhu
Haibin Lin
Chuanxiong Guo
+ DL2: A Deep Learning-Driven Scheduler for Deep Learning Clusters 2021 Yanghua Peng
Yixin Bao
Yangrui Chen
Chuan Wu
Chen Meng
Wei Lin
+ BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing 2021 Tianfeng Liu
Yangrui Chen
Dan Li
Chuan Wu
Yibo Zhu
Jun He
Yanghua Peng
Hongzheng Chen
Hongzhi Chen
Chuanxiong Guo
+ DL2: A Deep Learning-driven Scheduler for Deep Learning Clusters 2019 Yanghua Peng
Yixin Bao
Yangrui Chen
Chuan Wu
Chen Meng
Wei Lin
+ Online Job Scheduling in Distributed Machine Learning Clusters 2018 Yixin Bao
Yanghua Peng
Chuan Wu
Zongpeng Li
+ Dynamic Scaling of Virtualized, Distributed Service Chains: A Case Study of IMS 2017 Jingpu Duan
Chuan Wu
Franck Le
Alex X. Liu
Yanghua Peng
Commonly Cited References
Action Title Year Authors # of times referenced
+ Very Deep Convolutional Networks for Large-Scale Image Recognition 2014 Karen Simonyan
Andrew Zisserman
4
+ Towards Distributed Machine Learning in Shared Clusters: A Dynamically-Partitioned Approach 2017 Peng Sun
Yonggang Wen
Ta Nguyen Binh Duong
Shengen Yan
3
+ Deep Residual Learning for Image Recognition 2016 Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
3
+ Online Job Scheduling in Distributed Machine Learning Clusters 2018 Yixin Bao
Yanghua Peng
Chuan Wu
Zongpeng Li
3
+ MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems 2015 Tianqi Chen
Mu Li
Yutian Li
Min Lin
Naiyan Wang
Minjie Wang
Tianjun Xiao
Bing Xu
Chiyuan Zhang
Zheng Zhang
3
+ TensorFlow: A system for large-scale machine learning 2016 Martín Abadi
Paul Barham
Jianmin Chen
Zhifeng Chen
Andy Davis
Jay B. Dean
Matthieu Devin
Sanjay Ghemawat
Geoffrey Irving
Michael Isard
3
+ Learning scheduling algorithms for data processing clusters 2019 Hongzi Mao
Malte Schwarzkopf
Shaileshh Bojja Venkatakrishnan
Zili Meng
Mohammad Alizadeh
2
+ Device Placement Optimization with Reinforcement Learning 2017 Azalia Mirhoseini
Hieu Pham
Quoc V. Le
Benoit Steiner
Rasmus Larsen
Yuefeng Zhou
Naveen Kumar
Mohammad Norouzi
Samy Bengio
Jeff Dean
2
+ Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour 2017 Priya Goyal
Piotr Dollár
Ross Girshick
Pieter Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
2
+ StarCraft II: A New Challenge for Reinforcement Learning 2017 Oriol Vinyals
Timo Ewalds
Sergey Bartunov
Petko Georgiev
Alexander Sasha Vezhnevets
Michelle Yeo
Alireza Makhzani
Heinrich Küttler
John Agapiou
Julian Schrittwieser
2
+ Speech recognition with deep recurrent neural networks 2013 Alex Graves
Abdelrahman Mohamed
Geoffrey E. Hinton
2
+ Very Deep Convolutional Networks for Large-Scale Image Recognition 2014 Karen Simonyan
Andrew Zisserman
2
+ Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift 2015 Sergey Ioffe
Christian Szegedy
2
+ FireCaffe: Near-Linear Acceleration of Deep Neural Network Training on Compute Clusters 2016 Forrest Iandola
Matthew W. Moskewicz
Khalid Ashraf
Kurt Keutzer
2
+ Experience-driven Networking: A Deep Reinforcement Learning based Approach 2018 Zhiyuan Xu
Jian Tang
Jingsong Meng
Wei Zhang
Yanzhi Wang
Chi Harold Liu
Dejun Yang
2
+ Asynchronous Methods for Deep Reinforcement Learning 2016 Volodymyr Mnih
Adrià Puigdomènech Badia
Mehdi Mirza
Alex Graves
Tim Harley
Timothy Lillicrap
David Silver
Koray Kavukcuoglu
2
+ Adam: A Method for Stochastic Optimization 2014 Diederik P. Kingma
Jimmy Ba
2
+ Neural Machine Translation by Jointly Learning to Align and Translate 2015 Dzmitry Bahdanau
Kyunghyun Cho
Yoshua Bengio
2
+ An Analysis of Transformations 1964 George E. P. Box
David R. Cox
1
+ TLP: A Deep Learning-Based Cost Model for Tensor Program Tuning 2023 Yi Zhai
Yu Zhang
Shuo Liu
Xiaomeng Chu
Jie Peng
Jianmin Ji
Yanyong Zhang
1
+ Convolutional Neural Networks for Sentence Classification 2014 Yoon Kim
1
+ Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift 2015 Sergey Ioffe
Christian Szegedy
1
+ Neural Machine Translation by Jointly Learning to Align and Translate 2014 Dzmitry Bahdanau
Kyunghyun Cho
Yoshua Bengio
1
+ Rethinking the Inception Architecture for Computer Vision 2016 Christian Szegedy
Vincent Vanhoucke
Sergey Ioffe
Jon Shlens
Zbigniew Wojna
1
+ Convex Optimization 2004 Stephen Boyd
Lieven Vandenberghe
1
+ Aggregated Residual Transformations for Deep Neural Networks 2017 Saining Xie
Ross Girshick
Piotr Dollár
Zhuowen Tu
Kaiming He
1
+ Petuum: A New Platform for Distributed Machine Learning on Big Data 2015 Eric P. Xing
Qirong Ho
Wei Dai
Jin Kyu Kim
Jinliang Wei
Seunghak Lee
Xun Zheng
Pengtao Xie
Abhimanu Kumar
Yaoliang Yu
1
+ Convolutional Sequence to Sequence Learning 2017 Jonas Gehring
Michael Auli
David Grangier
Denis Yarats
Yann Dauphin
1
+ Robust unsupervised domain adaptation for neural networks via moment alignment 2019 Werner Zellinger
Bernhard Moser
Thomas Grubinger
Edwin Lughofer
Thomas Natschläger
Susanne Saminger‐Platz
1
+ BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding 2018 Jacob Devlin
Ming‐Wei Chang
Kenton Lee
Kristina Toutanova
1
+ Deep Learning for IoT Big Data and Streaming Analytics: A Survey 2018 Mehdi Mohammadi
Ala Al‐Fuqaha
Sameh Sorour
Mohsen Guizani
1
+ Deep Transfer Learning with Joint Adaptation Networks 2016 Mingsheng Long
Zhu Han
Jianmin Wang
Michael I. Jordan
1
+ MLPerf Inference Benchmark 2020 Vijay Janapa Reddi
Christine Cheng
David Kanter
Peter Mattson
Guenther Schmuelling
Carole-Jean Wu
Brian A. Anderson
Maximilien Breughe
Mark Charlebois
William Chou
1
+ Truthful Online Scheduling with Commitments 2015 Yossi Azar
Inna Kalp-Shaltiel
Brendan Lucier
Ishai Menache
Joseph Naor
Jonathan Yaniv
1
+ Shift-Robust GNNs: Overcoming the Limitations of Localized Graph Training Data 2021 Qi Zhu
Natalia Ponomareva
Jiawei Han
Bryan Perozzi
1
+ Domain Divergences: a Survey and Empirical Analysis 2020 Abhinav Ramesh Kashyap
Devamanyu Hazarika
Min‐Yen Kan
Roger Zimmermann
1
+ Priority-based Parameter Propagation for Distributed DNN Training 2019 Anand Jayarajan
Jinliang Wei
Garth A. Gibson
Alexandra Fedorova
Gennady Pekhimenko
1