Dynamic Spatial Sparsification for Efficient Vision Transformers and Convolutional Neural Networks
Dynamic Spatial Sparsification for Efficient Vision Transformers and Convolutional Neural Networks
In this paper, we present a new approach for model acceleration by exploiting spatial sparsity in visual data. We observe that the final prediction in vision Transformers is only based on a subset of the most informative regions, which is sufficient for accurate image recognition. Based on this observation, we …