Samuel Smith

All published works

Title	Year	Authors
NTIRE 2024 Challenge on Image Super-Resolution ($\times$4): Methods and Results	2024	Zheng Chen Zongwei Wu Eduard Zamfir Kai Zhang Yulun Zhang Radu Timofte Xiaokang Yang Hongyuan Yu Cheng Wan Yuxin Hong
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models	2024	Aleksandar Botev Soham De Samuel Smith Anushan Fernando George-Cristian Muraru Ruba Haroun Leonard Berrada Razvan Pascanu Pier Giuseppe Sessa Robert Dadashi
Gemma: Open Models Based on Gemini Research and Technology	2024	Gemma Team Thomas Mesnard Cassidy Hardin Robert Dadashi Surya Bhupatiraju Shreya Pathak Laurent Sifre Morgane Rivière Mihir Kale Juliette Love
Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation	2023	Bobby He James Martens Guodong Zhang Aleksandar Botev Andrew Brock Samuel Smith Yee Whye Teh
Differentially Private Diffusion Models Generate Useful Synthetic Images	2023	Sahra Ghalebikesabi Leonard Berrada Sven Gowal Sofia Ira Ktena Robert Stanforth Jamie Hayes Soham De Samuel Smith Olivia Wiles Borja Balle
Resurrecting Recurrent Neural Networks for Long Sequences	2023	Antonio Orvieto Samuel Smith Albert Gu Anushan Fernando Çaǧlar Gülçehre Razvan Pascanu Soham De
On the Universality of Linear Recurrences Followed by Nonlinear Projections	2023	Antonio Orvieto Soham De Çaǧlar Gülçehre Razvan Pascanu Samuel Smith
Unlocking Accuracy and Fairness in Differentially Private Image Classification	2023	Leonard Berrada Soham De Judy Hanwen Shen Jamie Hayes Robert Stanforth David Stutz Pushmeet Kohli Samuel Smith Borja Balle
ConvNets Match Vision Transformers at Scale	2023	Samuel Smith Andrew Brock Leonard Berrada Soham De
Unlocking High-Accuracy Differentially Private Image Classification through Scale	2022	Soham De Leonard Berrada Jamie Hayes Samuel Smith Borja Balle
A study on the plasticity of neural networks	2021	Tudor Berariu Wojciech Marian Czarnecki Soham De Jörg Bornschein Samuel Smith Razvan Pascanu Claudia Clopath
Drawing Multiple Augmentation Samples Per Image During Training Efficiently Decreases Test Error	2021	Stanislav Fort Andrew Brock Razvan Pascanu Soham De Samuel Smith
On the Origin of Implicit Regularization in Stochastic Gradient Descent	2021	Samuel Smith Benoît Dherin David G. T. Barrett Soham De
Characterizing signal propagation to close the performance gap in unnormalized ResNets	2021	Andrew Brock Soham De Samuel Smith
High-Performance Large-Scale Image Recognition Without Normalization	2021	Andrew Brock Soham De Samuel Smith Karen Simonyan
BYOL works even without batch statistics	2020	Pierre H. Richemond Jean-Bastien Grill Florent Altché Corentin Tallec Florian Strub Andrew Brock Samuel Smith Soham De Razvan Pascanu Bilal Piot
Batch Normalization Biases Deep Residual Networks Towards Shallow Paths	2020	Soham De Samuel Smith
Batch Normalization Biases Residual Blocks Towards the Identity Function in Deep Networks	2020	Soham De Samuel Smith
On the Generalization Benefit of Noise in Stochastic Gradient Descent	2020	Samuel Smith Erich Elsen Soham De
The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study	2019	Daniel Park Jascha Sohl‐Dickstein Quoc V. Le Samuel Smith
Stochastic natural gradient descent draws posterior samples in function space	2018	Samuel Smith Daniel Duckworth Semon Rezchikov Quoc V. Le Jascha Sohl‐Dickstein
Don't decay the learning rate, increase the batch size	2018	Samuel Smith Pieter-Jan Kindermans Chris Ying Quoc V. Le
Decoding Decoders: Finding Optimal Representation Spaces for Unsupervised Similarity Tasks	2018	В. К. Железняк Dan Busbridge April Shen Samuel Smith Nils Hammerla
A Bayesian Perspective on Generalization and Stochastic Gradient Descent	2017	Samuel Smith Quoc V. Le
Understanding Generalization and Stochastic Gradient Descent	2017	Samuel Smith Quoc V. Le
Offline bilingual word vectors, orthogonal transformations and the inverted softmax	2017	Samuel Smith David H. P. Turban Steven Hamblin Nils Hammerla
Energy Efficient Dissociation of Excitons to Free Charges	2017	Maxim Tabachnyk Samuel Smith Leah R. Weiss Aditya Sadhanala Alex W. Chin Richard H. Friend Akshay Rao
Don't Decay the Learning Rate, Increase the Batch Size	2017	Samuel Smith Pieter-Jan Kindermans Chris Ying Quoc V. Le
Monte Carlo Sort for unreliable human comparisons	2016	Samuel Smith
Phonon-assisted ultrafast charge separation in the PCBM band structure	2015	Samuel Smith Alex W. Chin
Disorder in the spectral function of a qubit ensemble	2015	Samuel Smith Alex W. Chin
Ultrafast charge separation and nongeminate electron–hole recombination in organic photovoltaics	2014	Samuel Smith Alex W. Chin

Common Coauthors

Coauthor	Papers Together
Soham De	21
Andrew Brock	8
Quoc V. Le	8
Razvan Pascanu	7
Leonard Berrada	5
Alex W. Chin	4
Nils Hammerla	4
Borja Balle	3
Benoît Dherin	3
Jascha Sohl‐Dickstein	3
Jamie Hayes	3
David G. T. Barrett	3
Ludovic Peran	2
Olivier Bachem	2
Juliette Love	2
Noah Fiedel	2
Surya Bhupatiraju	2
Johan Ferret	2
Sertan Girgin	2
Anushan Fernando	2
Cassidy Hardin	2
April Shen	2
Thomas Mesnard	2
Stanislav Fort	2
Sebastian Borgeaud	2
Zoubin Ghahramani	2
Mihir Kale	2
David H. P. Turban	2
Koray Kavukcuoglu	2
Tris Warkentin	2
Yee Whye Teh	2
Antonia Paterson	2
Çaǧlar Gülçehre	2
Chris Ying	2
Pier Giuseppe Sessa	2
Robert Dadashi	2
Evan Senter	2
Pieter-Jan Kindermans	2
Clément Farabet	2
Daniel Duckworth	2
Dan Busbridge	2
Laurent Sifre	2
Semon Rezchikov	2
Demis Hassabis	2
Aleksandar Botev	2
В. К. Железняк	2
Léonard Hussenot	2
Antonio Orvieto	2
Kathleen Kenealy	2
Steven Hamblin	2

Commonly Cited References

Title	Year	Authors	# of times referenced
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour	2017	Priya Goyal Piotr Dollár Ross Girshick Pieter Noordhuis Lukasz Wesolowski Aapo Kyrola Andrew Tulloch Yangqing Jia Kaiming He	11
Stochastic Gradient Descent as Approximate Bayesian Inference	2017	Stephan Mandt Matthew D. Hoffman David M. Blei	7
Three Factors Influencing Minima in SGD	2017	Stanisław Jastrzȩbski Zachary Kenton Devansh Arpit Nicolas Ballas Asja Fischer Yoshua Bengio Amos Storkey	6
Deep Residual Learning for Image Recognition	2016	Kaiming He Xiangyu Zhang Shaoqing Ren Jian Sun	6
An Empirical Model of Large-Batch Training	2018	Sam McCandlish Jared Kaplan Dario Amodei OpenAI Dota Team	5
Measuring the Effects of Data Parallelism on Neural Network Training	2018	Christopher J. Shallue Jaehoon Lee Joseph M. Antognini Jascha Sohl‐Dickstein Roy Frostig George E. Dahl	5
Identity Mappings in Deep Residual Networks	2016	Kaiming He Xiangyu Zhang Shaoqing Ren Jian Sun	4
Stochastic Gradient Descent Performs Variational Inference, Converges to Limit Cycles for Deep Networks	2018	Pratik Chaudhari Stefano Soatto	4
Don't decay the learning rate, increase the batch size	2018	Samuel Smith Pieter-Jan Kindermans Chris Ying Quoc V. Le	4
Bayesian Learning via Stochastic Gradient Langevin Dynamics	2011	Max Welling Yee Whye Teh	4
Wide Residual Networks	2016	Sergey Zagoruyko Nikos Komodakis	4
Momentum Contrast for Unsupervised Visual Representation Learning	2020	Kaiming He Haoqi Fan Yuxin Wu Saining Xie Ross Girshick	4
Large Batch Training of Convolutional Networks	2017	Yang You Igor Gitman Boris Ginsburg	3
Some methods of speeding up the convergence of iteration methods	1964	B. T. Polyak	3
ImageNet Large Scale Visual Recognition Challenge	2015	Olga Russakovsky Jia Deng Hao Su Jonathan Krause Sanjeev Satheesh Sean Ma Zhiheng Huang Andrej Karpathy Aditya Khosla Michael S. Bernstein	3
Instance Normalization: The Missing Ingredient for Fast Stylization	2016	Dmitry Ulyanov Andrea Vedaldi Victor Lempitsky	3
Deep Networks with Stochastic Depth	2016	Gao Huang Yu Sun Zhuang Liu Daniel Sedra Kilian Q. Weinberger	3
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift	2015	Sergey Ioffe Christian Szegedy	3
A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks	2019	Umut Şimşekli Levent Sagun Mert Gürbüzbalaban	3
Wide Residual Networks	2016	Sergey Zagoruyko Nikos Komodakis	3
SGDR: Stochastic Gradient Descent with Warm Restarts	2016	Ilya Loshchilov Frank Hutter	3
Group Normalization	2018	Yuxin Wu Kaiming He	3
The Break-Even Point on Optimization Trajectories of Deep Neural Networks	2020	Stanisław Jastrzȩbski Maciej Szymczak Stanislav Fort Devansh Arpit Jacek Tabor Kyunghyun Cho Krzysztof J. Geras	3
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification	2015	Kaiming He Xiangyu Zhang Shaoqing Ren Jian Sun	3
CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features	2019	Sangdoo Yun Dongyoon Han Sanghyuk Chun Seong Joon Oh Youngjoon Yoo Junsuk Choe	3
Stochastic modified equations and adaptive stochastic gradient algorithms	2015	Qianxiao Li Cheng Tai E Weinan	3
The Marginal Value of Adaptive Gradient Methods in Machine Learning	2017	Ashia C. Wilson Rebecca Roelofs Mitchell Stern Nathan Srebro Benjamin Recht	3
Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms	2017	Xiao Han Kashif Rasul Roland Vollgraf	3
Order and Chaos: NTK views on DNN Normalization, Checkerboard and Boundary Artifacts	2019	Arthur Paul Jacot Franck Gabriel François Gaston Ged Clément Hongler	3
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima	2016	Nitish Shirish Keskar Dheevatsa Mudigere Jorge Nocedal Mikhail Smelyanskiy Ping Tang	3
Rethinking the Inception Architecture for Computer Vision	2016	Christian Szegedy Vincent Vanhoucke Sergey Ioffe Jon Shlens Zbigniew Wojna	3
mixup: Beyond Empirical Risk Minimization	2017	Hongyi Zhang Moustapha Cissé Yann Dauphin David López-Paz	2
Squeeze-and-Excitation Networks	2019	Jie Hu Li Shen Samuel Albanie Gang Sun Enhua Wu	2
Optimizing Neural Networks with Kronecker-factored Approximate Curvature	2015	James Martens Roger Grosse	2
Ultrafast Charge Separation in Organic Photovoltaics Enhanced by Charge Delocalization and Vibronically Hot Exciton Dissociation	2013	Hiroyuki Tamura Irène Burghardt	2
In-place Activated BatchNorm for Memory-Optimized Training of DNNs	2018	Samuel Rota Bulò Lorenzo Porzi Peter Kontschieder	2
Normalization Propagation: A Parametric Technique for Removing Internal Covariate Shift in Deep Networks	2016	Devansh Arpit Yingbo Zhou Bhargava Urala Kota Venu Govindaraju	2
MobileNetV2: Inverted Residuals and Linear Bottlenecks	2018	Mark Sandler Andrew Howard Menglong Zhu Andrey Zhmoginov Liang-Chieh Chen	2
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks	2019	Mingxing Tan Quoc V. Le	2
Freeze and Chaos for DNNs: an NTK view of Batch Normalization, Checkerboard and Boundary Effects	2019	Arthur Paul Jacot Franck Gabriel Clément Hongler	2
The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study	2019	Daniel Park Jascha Sohl‐Dickstein Quoc V. Le Samuel Smith	2
Bag of Tricks for Image Classification with Convolutional Neural Networks	2019	Tong He Zhi Zhang Hang Zhang Zhongyue Zhang Junyuan Xie Mu Li	2
Aggregated Residual Transformations for Deep Neural Networks	2017	Saining Xie Ross Girshick Piotr Dollár Zhuowen Tu Kaiming He	2
The Role of Driving Energy and Delocalized States for Charge Separation in Organic Semiconductors	2012	Artem A. Bakulin Akshay Rao Vlad G. Pavelyev P. H. M. van Loosdrecht Maxim S. Pshenichnikov Dorota Niedziałek Jérôme Cornil David Beljonne Richard H. Friend	2
Fixup Initialization: Residual Learning Without Normalization	2019	Hongyi Zhang Yann Dauphin Tengyu Ma	2
Bridging Nonlinearities and Stochastic Regularizers with Gaussian Error Linear Units	2016	Dan Hendrycks Kevin Gimpel	2
Gaussian Error Linear Units (GELUs)	2016	Dan Hendrycks Kevin Gimpel	2
Noise-induced quantum coherence drives photo-carrier generation dynamics at polymeric semiconductor heterojunctions	2014	Eric R. Bittner Carlos Silva	2
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning	2016	Christian Szegedy Sergey Ioffe Vincent Vanhoucke Alexander A. Alemi	2