Training Noise Token Pruning

Type: Preprint

Publication Date: 2024-11-27

Citations: 0

DOI: https://doi.org/10.48550/arxiv.2411.18092


Abstract

In this work we present Training Noise Token (TNT) Pruning for vision transformers. Our method relaxes the discrete token-dropping condition to continuous additive noise, providing smooth optimization during training while retaining the computational gains of discrete dropping at deployment. We provide theoretical connections to the rate-distortion literature, along with empirical evaluations on the ImageNet dataset using ViT and DeiT architectures that demonstrate TNT's advantages over previous pruning methods.
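
As a rough illustration of the training-versus-deployment behavior described in the abstract, the sketch below adds score-dependent Gaussian noise to low-importance tokens during training and hard-drops the lowest-scoring tokens at inference. The per-token scoring rule (embedding norm), the noise weighting, and the keep ratio are illustrative assumptions, not details taken from the paper.

    # Minimal PyTorch sketch of a continuous-noise relaxation of token dropping.
    # The scoring rule, noise weighting, and keep ratio are assumptions for
    # illustration only; they are not taken from the TNT paper.
    import torch


    def tnt_tokens(tokens: torch.Tensor, keep_ratio: float = 0.5,
                   noise_scale: float = 1.0, training: bool = True) -> torch.Tensor:
        """tokens: (batch, num_tokens, dim) patch embeddings of a ViT block."""
        # Hypothetical per-token importance score: L2 norm of each embedding.
        scores = tokens.norm(dim=-1)                       # (batch, num_tokens)

        if training:
            # Continuous relaxation: instead of discretely dropping low-score
            # tokens, add Gaussian noise whose weight grows as the score falls,
            # keeping the objective smooth and differentiable.
            weights = torch.softmax(scores, dim=-1).unsqueeze(-1)
            noise = torch.randn_like(tokens) * noise_scale
            return tokens + (1.0 - weights) * noise

        # Deployment: recover the computational gain of discrete dropping by
        # keeping only the top-scoring tokens.
        num_keep = max(1, int(keep_ratio * tokens.shape[1]))
        topk = scores.topk(num_keep, dim=-1).indices       # (batch, num_keep)
        idx = topk.unsqueeze(-1).expand(-1, -1, tokens.shape[-1])
        return torch.gather(tokens, dim=1, index=idx)


    if __name__ == "__main__":
        x = torch.randn(2, 197, 768)                       # ViT-B/16 token layout
        print(tnt_tokens(x, training=True).shape)          # torch.Size([2, 197, 768])
        print(tnt_tokens(x, training=False).shape)         # torch.Size([2, 98, 768])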

Locations

  • arXiv (Cornell University)

Similar Works

  • DiffRate: Differentiable Compression Rate for Efficient Vision Transformers (2023). Mengzhao Chen, Wenqi Shao, Peng Xu, Mingbao Lin, Kaipeng Zhang, Fei Chao, Rongrong Ji, Yu Qiao, Ping Luo
  • TPC-ViT: Token Propagation Controller for Efficient Vision Transformer (2024). Wentao Zhu
  • Adaptive Sparse ViT: Towards Learnable Adaptive Token Pruning by Fully Exploiting Self-Attention (2022). Xiangcheng Liu, Tianyi Wu, Guodong Guo
  • Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models (2023). Ajay Jaiswal, Shiwei Liu, Tianlong Chen, Ying Ding, Zhangyang Wang
  • Zero-TPrune: Zero-Shot Token Pruning through Leveraging of the Attention Graph in Pre-Trained Transformers (2023). Hongjie Wang, Bhishma Dedhia, Niraj K. Jha
  • Joint Token Pruning and Squeezing Towards More Aggressive Compression of Vision Transformers (2023). Siyuan Wei, Tianzhu Ye, Shen Zhang, Yao Tang, Jiajun Liang
  • Token Compensator: Altering Inference Cost of Vision Transformer without Re-Tuning (2024). Shibo Jie, Yehui Tang, Jianyuan Guo, Zhi-Hong Deng, Kai Han, Yunhe Wang
  • Token Pooling in Vision Transformers (2021). Dmitrii Marin, Jen-Hao Rick Chang, Anurag Ranjan, Anish Prabhu, Mohammad Rastegari, Oncel Tuzel
  • NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers (2023). Yijiang Liu, Huanrui Yang, Zhen Dong, Kurt Keutzer, Li Du, Shanghang Zhang
  • NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers (2022). Yijiang Liu, Huanrui Yang, Zhen Dong, Kurt Keutzer, Li Du, Shanghang Zhang
  • Token Statistics Transformer: Linear-Time Attention via Variational Rate Reduction (2024). Ziyang Wu, Tianjiao Ding, Yifu Lu, Druv Pai, Jingyuan Zhang, Weida Wang, Yaodong Yu, Yi Ma, Benjamin D. Haeffele
  • A Survey on Deep Neural Network Pruning: Taxonomy, Comparison, Analysis, and Recommendations (2024). Hongrong Cheng, Miao Zhang, Qinfeng Shi
  • Learned Threshold Pruning (2020). Kambiz Azarian, Yash Bhalgat, Jinwon Lee, Tijmen Blankevoort
  • Preserving Deep Representations In One-Shot Pruning: A Hessian-Free Second-Order Optimization Framework (2024). Ryan Lucas, Rahul Mazumder
  • Bi-ViT: Pushing the Limit of Vision Transformer Quantization (2023). Yanjing Li, Sheng Xu, Mingbao Lin, Xianbin Cao, Chuanjian Liu, Xiao Sun, Baochang Zhang
  • Bi-ViT: Pushing the Limit of Vision Transformer Quantization (2024). Yanjing Li, Sheng Xu, Mingbao Lin, Xianbin Cao, Chuanjian Liu, Xiao Sun, Baochang Zhang
  • What Makes for Good Tokenizers in Vision Transformer? (2022). Shengju Qian, Yi Zhu, Wenbo Li, Mu Li, Jiaya Jia
  • Dynamic Token Normalization Improves Vision Transformers (2021). Wenqi Shao, Yixiao Ge, Zhaoyang Zhang, Xuyuan Xu, Xiaogang Wang, Ying Shan, Ping Luo

Cited by (0)


Citing (0)
