Ask a Question

Prefer a chat interface with context about you and your work?

Multi-Compound Transformer for Accurate Biomedical Image Segmentation

Multi-Compound Transformer for Accurate Biomedical Image Segmentation

The recent vision transformer(i.e.for image classification) learns non-local attentive interaction of different patch tokens. However, prior arts miss learning the cross-scale dependencies of different pixels, the semantic correspondence of different labels, and the consistency of the feature representations and semantic embeddings, which are critical for biomedical segmentation. In this paper, …