SmartMem: Layout Transformation Elimination and Adaptation for Efficient DNN Execution on Mobile
SmartMem: Layout Transformation Elimination and Adaptation for Efficient DNN Execution on Mobile
This work is motivated by recent developments in Deep Neural Networks, particularly the Transformer architectures underlying applications such as ChatGPT, and the need for performing inference on mobile devices. Focusing on emerging transformers (specifically the ones with computationally efficient Swin-like architectures) and large models (e.g., Stable Diffusion and LLMs) based …