Ask a Question

Prefer a chat interface with context about you and your work?

LMSeg: Unleashing the Power of Large-Scale Models for Open-Vocabulary Semantic Segmentation

LMSeg: Unleashing the Power of Large-Scale Models for Open-Vocabulary Semantic Segmentation

It is widely agreed that open-vocabulary-based approaches outperform classical closed-set training solutions for recognizing unseen objects in images for semantic segmentation. Existing open-vocabulary approaches leverage vision-language models, such as CLIP, to align visual features with rich semantic features acquired through pre-training on large-scale vision-language datasets. However, the text prompts employed …