Pure tensor program rewriting via access patterns (representation pearl)

Type: Preprint

Publication Date: 2021-06-18

Citations: 16

DOI: https://doi.org/10.1145/3460945.3464953

Abstract

Tensor kernels in machine learning (ML) often correspond to pure mathematical expressions, making term rewriting an attractive strategy for optimization and mapping to specialized hardware accelerators. However, existing ML intermediate representations (IRs) tend to either be pure but high-level, making low-level rewrites to hardware targets inexpressible, or low-level but impure, hampering the use of term rewriting altogether.

We present Glenside, a pure IR whose core abstraction, the access pattern, enables low-level, layout-aware expression of standard tensor kernels such as matrix multiplication and convolution. We demonstrate how term rewriting in Glenside can be used to map tensor kernels to hardware accelerator invocations and to automatically discover classic data layout transformations like im2col. Glenside establishes a new foundation for exploring further term rewriting techniques in optimizing low-level tensor programs.
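To make the premise concrete: because a pure tensor expression is just a term, algebraic identities can be applied mechanically by a rewrite engine. The sketch below is not Glenside's access-pattern IR; it is a toy illustration using the Rust egg crate (the equality-saturation library Glenside builds on, assumed here at version 0.9) that rewrites a composition of transposes and a matrix multiplication into its smallest equivalent form.

    use egg::{rewrite as rw, *};

    fn main() {
        // Two textbook identities over a toy tensor language:
        //   (X^T)^T = X        and        (A B)^T = B^T A^T
        let rules: &[Rewrite<SymbolLang, ()>] = &[
            rw!("transpose-involution"; "(transpose (transpose ?x))" => "?x"),
            rw!("matmul-transpose";
                "(transpose (matmul ?a ?b))" => "(matmul (transpose ?b) (transpose ?a))"),
        ];

        // Start from (B^T A^T)^T, which is semantically just (A B).
        let start: RecExpr<SymbolLang> =
            "(transpose (matmul (transpose b) (transpose a)))".parse().unwrap();

        // Equality saturation applies the rules repeatedly, keeping every
        // equivalent form in an e-graph; extraction picks the smallest term.
        let runner = Runner::default().with_expr(&start).run(rules);
        let (_cost, best) = Extractor::new(&runner.egraph, AstSize).find_best(runner.roots[0]);
        println!("{}", best); // prints: (matmul a b)
    }

Because the language is pure, both rules are sound wherever they match; no alias or effect analysis is needed, which is exactly the property the abstract argues low-level impure IRs give up.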

Locations

  • arXiv (Cornell University)
  • DataCite API

Similar Works

  • Matching Linear Algebra and Tensor Code to Specialized Hardware Accelerators (2023). Pablo A. Lanzarote Martínez, Jackson Woodruff, Jordi Armengol-Estapé, Gregorio Bernabé, José M. García, Michael O'Boyle
  • Optimizing Tensor Computation Graphs with Equality Saturation and Monte Carlo Tree Search (2024). Jakob Hartmann, Guoliang He, Eiko Yoneki
  • A Multi-Level Superoptimizer for Tensor Programs (2024). Mengdi Wu, Xinhao Cheng, Oded Padon, Zhihao Jia
  • OLLIE: Derivation-based Tensor Program Optimizer (2022). Liyan Zheng, Haojie Wang, Jidong Zhai, Muyan Hu, Zixuan Ma, Tuowei Wang, Shizhi Tang, Lei Xie, Kezhao Huang, Zhihao Jia
  • TF-Coder: Program Synthesis for Tensor Manipulations (2022). Kensen Shi, David Bieber, Rishabh Singh
  • TF-Coder: Program Synthesis for Tensor Manipulations (2020). Kensen Shi, David Bieber, Rishabh Singh
  • Polyhedral Tensor Schedulers (2019). Benoît Meister, Eric Papenhausen (Akai Kaeru), Benoît Pradelle (Silexica)
  • TapirXLA: Embedding Fork-Join Parallelism into the XLA Compiler in TensorFlow Using Tapir (2019). Tao B. Schardl, Siddharth Samsi
  • Generating Efficient Programs for Two-Level Memories from Tensor-products (1995). Sandeep K. S. Gupta, Zhiyong Li, John H. Reif
  • Optimal Kernel Orchestration for Tensor Programs with Korch (2024). Muyan Hu, Ashwin Venkatram, S. Biswas, Balamurugan Marimuthu, Bohan Hou, G. Oliaro, Haojie Wang, Liyan Zheng, Xupeng Miao, Jidong Zhai
  • Predictive data locality optimization for higher-order tensor computations (2021). Tharindu R. Patabandi, Anand Venkat, Abhishek Kulkarni, Pushkar Ratnalikar, Mary Hall, Justin Gottschlich
  • Tensor Evolution: A framework for Fast Evaluation of Tensor Computations using Recurrences (2025). Javed Absar, Samarth Narang, Muthu Manikandan Baskaran
  • Instead of Rewriting Foreign Code for Machine Learning, Automatically Synthesize Fast Gradients (2020). William S. Moses, Valentin Churavy
  • LoopStack: a Lightweight Tensor Algebra Compiler Stack (2022). Bram Wasti, José Cambronero, Benoit Steiner, Hugh Leather, Aleksandar Zlateski
  • Predictive Synthesis of API-Centric Code (2022). Daye Nam, Baishakhi Ray, Seohyun Kim, Xianshan Qu, Satish Chandra
  • Pruner: An Efficient Cross-Platform Tensor Compiler with Dual Awareness (2024). Liang Qiao, Jun Shi, Xiaoyu Hao, Xi Fang, Minfan Zhao, Ziqi Zhu, Junshi Chen, Hong An, Bing Li, Honghui Yuan
  • TPU-MLIR: A Compiler For TPU Using MLIR (2022). Pengchao Hu, Man Lu, Lei Wang, Guoyue Jiang
  • A Holistic Functionalization Approach to Optimizing Imperative Tensor Programs in Deep Learning (2024). Jinming Ma, Xiuhong Li, Zihan Wang, Xingcheng Zhang, Shengen Yan, Yuting Chen, Yueqian Zhang, Minxi Jin, Lijuan Jiang, Yun Liang

Works Cited by This (33)

  • The tensor algebra compiler (2017). Fredrik Kjølstad, Shoaib Kamil, Stephen Chou, David Lugato, Saman Amarasinghe
  • In-Datacenter Performance Analysis of a Tensor Processing Unit (2017). Norman P. Jouppi, Cliff Young, Nishant Patil, David A. Patterson, Gaurav Agrawal, Raminder Bajwa, S. C. Bates, Suresh Bhatia, Nan Boden, Al Borchers, et al.
  • Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions (2018). Nicolas Vasilache, Oleksandr Zinenko, Theodoros Theodoridis, Priya Goyal, Zachary DeVito, William S. Moses, Sven Verdoolaege, Andrew Adams, Albert Cohen
  • Learning to Optimize Tensor Programs (2018). Tianqi Chen, Lianmin Zheng, Eddie Yan, Ziheng Jiang, Thierry Moreau, Luís Ceze, Carlos Guestrin, Arvind Krishnamurthy
  • Tiramisu: A Polyhedral Compiler for Expressing Fast and Portable Code (2019). Riyadh Baghdadi, Jessica Ray, Malek Ben Romdhane, Emanuele Del Sozzo, Abdurrahman Akkas, Yunming Zhang, Patricia Suriana, Shoaib Kamil, Saman Amarasinghe
  • Relay: A High-Level IR for Deep Learning (2019). Jared Roesch, Steven Lyubomirsky, Marisa Kirisame, Josh Pollock, Logan Weber, Ziheng Jiang, Tianqi Chen, Thierry Moreau, Zachary Tatlock
  • TensorFlow: A system for large-scale machine learning (2016). Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al.
  • A Hardware–Software Blueprint for Flexible Deep Learning Specialization (2019). Thierry Moreau, Tianqi Chen, Luis Vega, Jared Roesch, Eddie Yan, Lianmin Zheng, Josh Fromm, Ziheng Jiang, Luís Ceze, Carlos Guestrin, et al.