ALWANN: Automatic Layer-Wise Approximation of Deep Neural Network Accelerators without Retraining | 2019 | Vojtěch Mrázek, Zdeněk Vašíček, Lukáš Sekanina, Muhammad Abdullah Hanif, Muhammad Shafique | 6
In-Datacenter Performance Analysis of a Tensor Processing Unit | 2017 | Norman P. Jouppi, Cliff Young, Nishant Patil, David A. Patterson, Gaurav Agrawal, Raminder Bajwa, S. C. Bates, Suresh Bhatia, Nan Boden, Al Borchers | 6
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference | 2018 | Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew F. Tang, Andrew Howard, Hartwig Adam, Dmitry Kalenichenko | 4
In-Datacenter Performance Analysis of a Tensor Processing Unit | 2017 | Norman P. Jouppi, Cliff Young, Nishant Patil, David A. Patterson, Gaurav Agrawal, Raminder Bajwa, S. C. Bates, Suresh Bhatia, Nan Boden, Al Borchers | 4
TFApprox: Towards a Fast Emulation of DNN Approximate Hardware Accelerators on GPU | 2020 | Filip Vaverka, Vojtěch Mrázek, Zdeněk Vašíček, Lukáš Sekanina | 3
Control Variate Approximation for DNN Accelerators | 2021 | Georgios Zervakis, Ourania Spantidi, Iraklis Anagnostopoulos, Hussam Amrouch, Jörg Henkel | 3
Deep Residual Learning for Image Recognition | 2016 | Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun | 3
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications | 2017 | Andrew Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, Hartwig Adam | 2
Very Deep Convolutional Networks for Large-Scale Image Recognition | 2014 | Karen Simonyan, Andrew Zisserman | 2
Rethinking the Inception Architecture for Computer Vision | 2016 | Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, Zbigniew Wojna | 2
ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices | 2018 | Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, Jian Sun | 2
MobileNetV2: Inverted Residuals and Linear Bottlenecks | 2018 | Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen | 2
Quantizing deep convolutional networks for efficient inference: A whitepaper | 2018 | Raghuraman Krishnamoorthi | 2
Going deeper with convolutions | 2015 | Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich | 2
SCALE-Sim: Systolic CNN Accelerator Simulator | 2018 | Ananda Samajdar, Yuhao Zhu, Paul N. Whatmough, Matthew Mattina, Tushar Krishna | 2
APQ: Joint Search for Network Architecture, Pruning and Quantization Policy | 2020 | Tianzhe Wang, Kuan Wang, Han Cai, Ji Lin, Zhijian Liu, Hanrui Wang, Yujun Lin, Song Han | 1
Differentiable Joint Pruning and Quantization for Hardware Efficiency | 2020 | Ying Wang, Yadong Lu, Tijmen Blankevoort | 1
Mining parametric temporal logic properties in model-based design for cyber-physical systems | 2017 | Bardh Hoxha, Adel Dokhanchi, Georgios Fainekos | 1
Using Libraries of Approximate Circuits in Design of Hardware Accelerators of Deep Neural Networks | 2020 | Vojtěch Mrázek, Lukáš Sekanina, Zdeněk Vašíček | 1
Position-based Scaled Gradient for Model Quantization and Pruning | 2020 | Jangho Kim, KiYoon Yoo, Nojun Kwak | 1
PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning | 2020 | Wei Niu, Xiaolong Ma, Sheng Lin, Shihao Wang, Xuehai Qian, Xue Lin, Yanzhi Wang, Bin Ren | 1
VecQ: Minimal Loss DNN Model Compression With Vectorized Weight Quantization | 2020 | Cheng Gong, Yao Chen, Ye Lu, Tao Li, Cong Hao, Deming Chen | 1
Non-Structured DNN Weight Pruning—Is It Beneficial in Any Platform? | 2021 | Xiaolong Ma, Sheng Lin, Shaokai Ye, Zhezhi He, Linfeng Zhang, Geng Yuan, Sia Huat Tan, Zhengang Li, Deliang Fan, Xuehai Qian | 1
Heterogeneous Dataflow Accelerators for Multi-DNN Workloads | 2021 | Hyoukjun Kwon, Liangzhen Lai, Michael Pellauer, Tushar Krishna, Yu-Hsin Chen, Vikas Chandra | 1
OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization | 2021 | Peng Hu, Xi Peng, Hongyuan Zhu, Mohamed M. Sabry Aly, Jie Lin | 1
Positive/Negative Approximate Multipliers for DNN Accelerators | 2021 | Ourania Spantidi, Georgios Zervakis, Iraklis Anagnostopoulos, Hussam Amrouch, Jörg Henkel | 1
PQK: Model Compression via Pruning, Quantization, and Knowledge Distillation | 2021 | Jangho Kim, Simyung Chang, Nojun Kwak | 1
A full-stack search technique for domain optimized deep learning accelerators | 2022 | Dan Zhang, Safeen Huda, Ebrahim M. Songhori, Kartik Prabhu, Quoc V. Le, Anna Goldie, Azalia Mirhoseini | 1
Hardware Approximate Techniques for Deep Neural Network Accelerators: A Survey | 2022 | Giorgos Armeniakos, Georgios Zervakis, Dimitrios Soudris, Jörg Henkel | 1
Monte Carlo Tree Search: a review of recent modifications and applications | 2022 | Maciej Świechowski, Konrad Godlewski, Bartosz Sawicki, Jacek Mańdziuk | 1
Loss Aware Post-training Quantization | 2019 | Yury Nahshan, Brian Chmiel, Chaim Baskin, Evgenii Zheltonozhskii, Ron Banner, Alex Bronstein, Avi Mendelson | 1
Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning | 2022 | Elias Frantar, Dan Alistarh | 1
Distributed Representations of Words and Phrases and their Compositionality | 2013 | Tomáš Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, Jeffrey Dean | 1
PyTorch: An Imperative Style, High-Performance Deep Learning Library | 2019 | Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James T. Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga | 1
Real-Time Scheduling of Machine Learning Operations on Heterogeneous Neuromorphic SoC | 2022 | Anup Das | 1
A Survey and Empirical Evaluation of Parallel Deep Learning Frameworks | 2021 | Daniel Nichols, Siddharth Singh, Shu-Huai Lin, Abhinav Bhatelé | 1
Designing Energy-Efficient Convolutional Neural Networks Using Energy-Aware Pruning | 2017 | Tien-Ju Yang, Yu-Hsin Chen, Vivienne Sze | 1
Rainbow: Combining Improvements in Deep Reinforcement Learning | 2018 | Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Gheshlaghi Azar, David Silver | 1
AMC: AutoML for Model Compression and Acceleration on Mobile Devices | 2018 | Yihui He, Ji Lin, Zhijian Liu, Hanrui Wang, Li-Jia Li, Song Han | 1
Gaussian Error Linear Units (GELUs) | 2016 | Dan Hendrycks, Kevin Gimpel | 1
High-Throughput CNN Inference on Embedded ARM Big.LITTLE Multicore Processors | 2019 | Siqi Wang, Gayathri Ananthanarayanan, Yifan Zeng, Neeraj Goel, Anuj Pathania, Tulika Mitra | 1
NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications | 2018 | Tien-Ju Yang, Andrew Howard, Bo Chen, Xiao Zhang, Alec Go, Mark Sandler, Vivienne Sze, Hartwig Adam | 1
Channel Pruning for Accelerating Very Deep Neural Networks | 2017 | Yihui He, Xiangyu Zhang, Jian Sun | 1
Learning Transferable Architectures for Scalable Image Recognition | 2018 | Barret Zoph, Vijay Vasudevan, Jonathon Shlens, Quoc V. Le | 1
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning | 2017 | Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alexander A. Alemi | 1
HAQ: Hardware-Aware Automated Quantization With Mixed Precision | 2019 | Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, Song Han | 1
Bayesian Bits: Unifying Quantization and Pruning | 2020 | Mart van Baalen, Christos Louizos, Markus Nagel, Rana Ali Amjad, Ying Wang, Tijmen Blankevoort, Max Welling | 1
Structured Compression by Weight Encryption for Unstructured Pruning and Quantization | 2020 | Se Jung Kwon, Dongsoo Lee, Byeongwook Kim, Parichay Kapoor, Baeseong Park, Gu-Yeon Wei | 1
Automatic Neural Network Compression by Sparsity-Quantization Joint Learning: A Constrained Optimization-Based Approach | 2020 | Haichuan Yang, Shupeng Gui, Yuhao Zhu, Ji Liu | 1