A Hardware-Software Blueprint for Flexible Deep Learning Specialization

Type: Preprint

Publication Date: 2018

Citations: 21

DOI: https://doi.org/10.48550/arxiv.1807.04188

Locations

  • arXiv (Cornell University)
  • DataCite API

Similar Works

  • A Hardware-Software Blueprint for Flexible Deep Learning Specialization (2018). Thierry Moreau, Tianqi Chen, Luis Vega, Jared Roesch, Eddie Yan, Lianmin Zheng, Josh Fromm, Ziheng Jiang, Luís Ceze, Carlos Guestrin
  • VTA: An Open Hardware-Software Stack for Deep Learning (2018). Thierry Moreau, Tianqi Chen, Ziheng Jiang, Luís Ceze, Carlos Guestrin, Arvind Krishnamurthy
  • A Hardware–Software Blueprint for Flexible Deep Learning Specialization (2019). Thierry Moreau, Tianqi Chen, Luis Vega, Jared Roesch, Eddie Yan, Lianmin Zheng, Josh Fromm, Ziheng Jiang, Luís Ceze, Carlos Guestrin
  • TVM: End-to-End Optimization Stack for Deep Learning (2018). Tianqi Chen, Thierry Moreau, Ziheng Jiang, Haichen Shen, Eddie Yan, Leyuan Wang, Yuwei Hu, Luís Ceze, Carlos Guestrin, Arvind Krishnamurthy
  • TVM: An Automated End-to-End Optimizing Compiler for Deep Learning (2018). Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Meghan Cowan, Haichen Shen, Leyuan Wang, Yuwei Hu, Luís Ceze
  • Tensor processing primitives: a programming abstraction for efficiency and portability in deep learning workloads (2021). Evangelos Georganas, Dhiraj D. Kalamkar, Sasikanth Avancha, Menachem Adelman, Cristina Anderson, Alexander Breuer, Jeremy Bruestle, Narendra Chaudhary, Abhisek Kundu, Denise Kutnick
  • Cost-Aware TVM (CAT) Tensorization for Modern Deep Learning Accelerators (2022). Yahang Hu, Yaohua Wang, Xiaoqiang Dan, Xiao Hu, Fei Liu, Jinpeng Li
  • Violet: Architecturally Exposed Orchestration, Movement, and Placement for Generalized Deep Learning (2021). Michael C. R. Davies, Adam Labiosa, Karthikeyan Sankaralingam
  • Hidet: Task-Mapping Programming Paradigm for Deep Learning Tensor Programs (2023). Yaoyao Ding, Cody Hao Yu, Bojian Zheng, Yizhi Liu, Yida Wang, Gennady Pekhimenko
  • Hidet: Task-Mapping Programming Paradigm for Deep Learning Tensor Programs (2022). Yaoyao Ding, Cody Hao Yu, Bojian Zheng, Yizhi Liu, Yida Wang, Gennady Pekhimenko
  • Enabling One-size-fits-all Compilation Optimization across Machine Learning Computers for Inference (2021). Y. Wen, Qi Guo, Zidong Du, Jianxing Xu, Zhenxing Zhang, Xing Hu, Wei Li, Rui Zhang, Chao Wang, Xuehai Zhou
  • Pruner: An Efficient Cross-Platform Tensor Compiler with Dual Awareness (2024). Liang Qiao, Jun Shi, Xiaoyu Hao, Xi Fang, Minfan Zhao, Ziqi Zhu, Junshi Chen, Hong An, Bing Li, Honghui Yuan
  • The Deep Learning Compiler: A Comprehensive Survey (2020). Mingzhen Li, Yi Liu, Xiaoyan Liu, Qingxiao Sun, Xin You, Hailong Yang, Zhongzhi Luan, Lin Gan, Guangwen Yang, Depei Qian
  • Field-Programmable Gate Array Architecture for Deep Learning: Survey & Future Directions (2024). Andrew Boutros, Aman Arora, Vaughn Betz
  • A Survey on Design Methodologies for Accelerating Deep Learning on Heterogeneous Architectures (2023). Fabrizio Ferrandi, Serena Curzel, Leandro Fiorin, Daniele Ielmini, Cristina Silvano, Francesco Conti, Alessio Burrello, Francesco Barchi, Luca Benini, Luciano Lavagno
  • Allo: A Programming Model for Composable Accelerator Design (2024). Hongzheng Chen, Niansong Zhang, Shaojie Xiang, Zhichen Zeng, Mengjia Dai, Zhiru Zhang
  • Intel nGraph: An Intermediate Representation, Compiler, and Executor for Deep Learning (2018). Scott Cyphers, Arjun K. Bansal, Anahita Bhiwandiwalla, Jayaram Bobba, Matthew Brookhart, Avijit Chakraborty, Will Constable, Christian Convey, Leona Cook, Omar Kanawi
  • Exploiting Parallelism Opportunities with Deep Learning Frameworks (2020). Yu Emma Wang, Carole-Jean Wu, Xiaodong Wang, Kim Hazelwood, David Brooks
  • RAF: Holistic Compilation for Deep Learning Model Training (2023). Cody Hao Yu, Haozheng Fan, Guangtai Huang, Zhen Jia, Yizhi Liu, Jie Wang, Zach Zheng, Yuan Zhou, Haichen Shen, Junru Shao
  • Exploiting Parallelism Opportunities with Deep Learning Frameworks (2019). Yu Emma Wang, Carole-Jean Wu, Xiaodong Wang, Kim Hazelwood, David J. Brooks

Works That Cite This (9)

  • A TinyML Platform for On-Device Continual Learning With Quantized Latent Replays (2021). Leonardo Ravaglia, Manuele Rusci, Davide Nadalini, Alessandro Capotondi, Francesco Conti, Luca Benini
  • Automatic Generation of Spatial Accelerator for Tensor Algebra (2022). Liancheng Jia, Zizhang Luo, Liqiang Lu, Yun Liang
  • Mobile-URSONet: an Embeddable Neural Network for Onboard Spacecraft Pose Estimation (2022). Julien Posso, Guy Bois, Yvon Savaria
  • SECDA: Efficient Hardware/Software Co-Design of FPGA-based DNN Accelerators for Edge Inference (2021). Jude Haris, Perry Gibson, José Cano, Nicolas Bohm Agostini, David Kaeli
  • Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights (2021). Shail Dave, Riyadh Baghdadi, Tony Nowatzki, Sasikanth Avancha, Aviral Shrivastava, Baoxin Li
  • Quantune: Post-training quantization of convolutional neural networks using extreme gradient boosting for fast deployment (2022). Jemin Lee, Misun Yu, Yongin Kwon, Taeho Kim
  • A Gradient-Interleaved Scheduler for Energy-Efficient Backpropagation for Training Neural Networks (2020). Nanda K. Unnikrishnan, Keshab K. Parhi
  • Hardware-Aware Quantization and Performance Evaluation for Tensor Accelerator VTA (2023). Huazheng Zhao, Shanmin Pang, Yinghai Zhao, Haihong Lang, Yunxin He, Hengxiang Xu, Fan Gao, Kuizhi Mei
  • Harnessing FPGA Technology for Enhanced Biomedical Computation (2023). Nisanur Alici

Works Cited by This (0)
