Ask a Question

Prefer a chat interface with context about you and your work?

GlobalMamba: Global Image Serialization for Vision Mamba

GlobalMamba: Global Image Serialization for Vision Mamba

Vision mambas have demonstrated strong performance with linear complexity to the number of vision tokens. Their efficiency results from processing image tokens sequentially. However, most existing methods employ patch-based image tokenization and then flatten them into 1D sequences for causal processing, which ignore the intrinsic 2D structural correlations of images. …