Cliff Young

All published works
Action Title Year Authors
+ TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings 2023 Norman P. Jouppi
George Thomas Kurian
Sheng Li
Peter Ma
Rahul Nagarajan
Lifeng Nai
Nishant Patil
Suvinay Subramanian
Andy Swing
Brian Towles
+ MegaBlocks: Efficient Sparse Training with Mixture-of-Experts 2022 Trevor Gale
Deepak Narayanan
Cliff Young
Matei Zaharia
+ Exploring the limits of Concurrency in ML Training on Google TPUs 2020 Sameer Kumar
James Bradbury
Cliff Young
Yu Emma Wang
Anselm Levskaya
Blake A. Hechtman
Dehao Chen
HyoukJoong Lee
Mehmet Deveci
Naveen Kumar
+ Sparse GPU Kernels for Deep Learning 2020 Trevor Gale
Matei Zaharia
Cliff Young
Erich Elsen
+ Bit-Parallel Vector Composability for Neural Acceleration 2020 Soroush Ghodrati
Hardik Sharma
Cliff Young
Nam Sung Kim
Hadi Esmaeilzadeh
+ Exploring the limits of Concurrency in ML Training on Google TPUs 2020 Sameer Kumar
Yu Wang
Cliff Young
James T. Bradbury
Naveen Kumar
Dehao Chen
Andy Swing
+ MLPerf Training Benchmark 2019 Peter Mattson
Christine Cheng
Cody Coleman
Greg Diamos
Paulius Micikevicius
David A. Patterson
Hanlin Tang
Gu-Yeon Wei
Peter Bailis
Victor Bittorf
+ Mesh-TensorFlow: Deep Learning for Supercomputers 2018 Noam Shazeer
Youlong Cheng
Niki Parmar
Dustin Tran
Ashish Vaswani
Penporn Koanantakool
Peter Hawkins
HyoukJoong Lee
Mingsheng Hong
Cliff Young
+ In-Datacenter Performance Analysis of a Tensor Processing Unit 2017 Norman P. Jouppi
Cliff Young
Nishant Patil
David A. Patterson
Gaurav Agrawal
Raminder Bajwa
S. C. Bates
Suresh Bhatia
Nan Boden
Al Borchers
+ In-Datacenter Performance Analysis of a Tensor Processing Unit 2017 Norman P. Jouppi
Cliff Young
Nishant Patil
David A. Patterson
Gaurav Agrawal
Raminder Bajwa
Sarah Bates
Suresh K. Bhatia
Nan Boden
Al Borchers
+ Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation 2016 Yonghui Wu
Mike Schuster
Zhifeng Chen
Quoc V. Le
Mohammad Norouzi
Wolfgang Macherey
Maxim Krikun
Yuan Cao
Qin Gao
Klaus Macherey
+ Asynchronous Iterative Algorithms 2011 Rajesh K. Karmani
Gul Agha
Mark S. Squillante
Joel Seiferas
Marian Brezina
Jonathan Hu
Ray Tuminaro
Peter Sanders
Jesper Larsson Träff
Robert A. van de Geijn
+ Drawing Drapery from Head to Toe 2007 Cliff Young
+ Statistical Analysis of Logic Circuit Performance in Digital Systems 1961 E. Nußbaum
E. Irland
Cliff Young
Common Coauthors
Commonly Cited References
Action Title Year Authors # of times referenced
+ ImageNet Large Scale Visual Recognition Challenge 2015 Olga Russakovsky
Jia Deng
Hao Su
Jonathan Krause
Sanjeev Satheesh
Sean Ma
Zhiheng Huang
Andrej Karpathy
Aditya Khosla
Michael S. Bernstein
5
+ Attention Is All You Need 2017 Ashish Vaswani
Noam Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan N. Gomez
Łukasz Kaiser
Illia Polosukhin
4
+ Deep Residual Learning for Image Recognition 2016 Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
4
+ Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation 2016 Yonghui Wu
Mike Schuster
Zhifeng Chen
Quoc V. Le
Mohammad Norouzi
Wolfgang Macherey
Maxim Krikun
Yuan Cao
Qin Gao
Klaus Macherey
4
+ cuDNN: Efficient Primitives for Deep Learning 2014 Sharan Chetlur
Cliff Woolley
Philippe Vandermersch
Jonathan Cohen
John Tran
Bryan Catanzaro
Evan Shelhamer
3
+ Identity Mappings in Deep Residual Networks 2016 Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
3
+ Fathom: reference workloads for modern deep learning methods 2016 Robert Adolf
Saketh Rama
Brandon Reagen
Gu-Yeon Wei
David Brooks
3
+ Mixed Precision Training 2017 Paulius Micikevicius
Sharan Narang
Jonah Alben
Gregory Diamos
Erich Elsen
David García
Boris Ginsburg
Michael Houston
Oleksii Kuchaiev
Ganesh Venkatesh
3
+ Enabling Sparse Winograd Convolution by Native Pruning 2017 Sheng R. Li
Jongsoo Park
Ping Tang
2
+ Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour 2017 Priya Goyal
Piotr Dollár
Ross Girshick
Pieter Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
2
+ In-Datacenter Performance Analysis of a Tensor Processing Unit 2017 Norman P. Jouppi
Cliff Young
Nishant Patil
David A. Patterson
Gaurav Agrawal
Raminder Bajwa
S. C. Bates
Suresh Bhatia
Nan Boden
Al Borchers
2
+ Learning Sparse Neural Networks through $L_0$ Regularization 2017 Christos Louizos
Max Welling
Diederik P. Kingma
2
+ Fast Algorithms for Convolutional Neural Networks 2016 Andrew Lavin
Scott Gray
2
+ EIE: Efficient Inference Engine on Compressed Deep Neural Network 2016 Song Han
Xingyu Liu
Huizi Mao
Jing Pu
Ardavan Pedram
Mark Horowitz
William J. Dally
2
+ Sampled Dense Matrix Multiplication for High-Performance Machine Learning 2018 Israt Nisa
Aravind Sukumaran-Rajam
Süreyya Emre Kurt
Changwan Hong
P. Sadayappan
2
+ Transformer-XL: Attentive Language Models beyond a Fixed-Length Context 2019 Zihang Dai
Zhilin Yang
Yiming Yang
Jaime Carbonell
Quoc V. Le
Ruslan Salakhutdinov
2
+ MobileNetV2: Inverted Residuals and Linear Bottlenecks 2018 Mark Sandler
Andrew Howard
Menglong Zhu
Andrey Zhmoginov
Liang-Chieh Chen
2
+ Adam: A Method for Stochastic Optimization 2014 Diederik P. Kingma
Jimmy Ba
2
+ EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks 2019 Mingxing Tan
Quoc V. Le
2
+ Sparse Persistent RNNs: Squeezing Large Recurrent Networks On-Chip 2018 Feiwen Zhu
Jeff Pool
Michael Andersch
Jeremy Appleyard
Fung Xie
2
+ Exploring Sparsity in Recurrent Neural Networks 2017 Sharan Narang
Gregory Diamos
Shubho Sengupta
Erich Elsen
2
+ Large Batch Training of Convolutional Networks 2017 Yang You
Igor Gitman
Boris Ginsburg
2
+ Efficient Content-Based Sparse Attention with Routing Transformers 2021 Aurko Roy
Mohammad Saffar
Ashish Vaswani
David Grangier
2
+ Generating Long Sequences with Sparse Transformers 2019 Rewon Child
Scott Gray
Alec Radford
Ilya Sutskever
2
+ MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications 2017 Andrew Howard
Menglong Zhu
Bo Chen
Dmitry Kalenichenko
Weijun Wang
Tobias Weyand
Marco Andreetto
Hartwig Adam
2
+ The State of Sparsity in Deep Neural Networks 2019 Trevor Gale
Erich Elsen
Sara Hooker
2
+ Deep Learning Recommendation Model for Personalization and Recommendation Systems 2019 Maxim Naumov
Dheevatsa Mudigere
Hao-Jun Michael Shi
Jianyu Huang
Narayanan Sundaraman
Jongsoo Park
Xiaodong Wang
Udit Gupta
Carole-Jean Wu
Alisson G. Azzolini
2
+ Balanced Sparsity for Efficient DNN Inference on GPU 2019 Zhuliang Yao
Shi-Jie Cao
Wencong Xiao
Chen Zhang
Lanshun Nie
2
+ TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems 2016 Martín Abadi
Ashish Agarwal
Paul Barham
Eugene Brevdo
Zhifeng Chen
Craig Citro
Gregory S. Corrado
Andy Davis
Jeffrey Dean
Matthieu Devin
2
+ HiCOO: Hierarchical Storage of Sparse Tensors 2018 Jiajia Li
Jimeng Sun
Richard Vuduc
2
+ Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift 2015 Sergey Ioffe
Christian Szegedy
2
+ To prune, or not to prune: exploring the efficacy of pruning for model compression 2017 Michael Zhu
Suyog Gupta
2
+ Block-Sparse Recurrent Neural Networks 2017 Sharan Narang
Eric Undersander
Gregory Diamos
2
+ Design Principles for Sparse Matrix Multiplication on the GPU 2018 Carl Yang
Aydın Buluç
John D. Owens
2
+ Exploring the Limits of Language Modeling 2016 Rafał Józefowicz
Oriol Vinyals
Mike Schuster
Noam Shazeer
Yonghui Wu
1
+ A chaotic asynchronous algorithm for computing the fixed point of a nonnegative matrix of unit spectral radius 1986 Boris D. Lubachevsky
Debasis Mitra
1
+ MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems 2015 Tianqi Chen
Mu Li
Yutian Li
Min Lin
Naiyan Wang
Minjie Wang
Tianjun Xiao
Bing Xu
Chiyuan Zhang
Zheng Zhang
1
+ Asynchronous two-stage iterative methods 1994 Andreas Frommer
Daniel B. Szyld
1
+ Coupling dynamic load balancing with asynchronism in iterative algorithms on the computational grid 2004 Jacques M. Bahi
Sylvain Contassot‐Vivier
Raphaël Couturier
1
+ Reliability Analysis Techniques 1960 Charles A. Krohn
1
+ On asynchronous iterations 2000 Andreas Frommer
Daniel B. Szyld
1
+ A new class of asynchronous iterative algorithms with order intervals 1998 J.C. Miellou
Didier El Baz
Pierre Spitéri
1
+ Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation 2014 Kyunghyun Cho
Bart van Merriënboer
Çağlar Gülçehre
Dzmitry Bahdanau
Fethi Bougares
Holger Schwenk
Yoshua Bengio
1
+ WRPN: Wide Reduced-Precision Networks 2017 Asit K. Mishra
Eriko Nurvitadhi
Jeffrey Cook
Debbie Marr
1
+ Deep Interest Network for Click-Through Rate Prediction 2018 Guorui Zhou
Xiaoqiang Zhu
Chenru Song
Ying Fan
Zhu Han
Xiao Ma
Yanghui Yan
Junqi Jin
Han Li
Kun Gai
1
+ Distributed asynchronous computation of fixed points 1983 Dimitri P. Bertsekas
1
+ Asynchronous Iterative Algorithms with Flexible Communication for Nonlinear Network Flow Problems 1996 Didier El Baz
P. Spitéri
J.C. Miellou
D. Gazen
1
+ Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift 2015 Sergey Ioffe
Christian Szegedy
1
+ None 2000 Daniel B. Szyld
Jianjun Xu
1