LMSeg: Unleashing the Power of Large-Scale Models for Open-Vocabulary
Semantic Segmentation
LMSeg: Unleashing the Power of Large-Scale Models for Open-Vocabulary
Semantic Segmentation
It is widely agreed that open-vocabulary-based approaches outperform classical closed-set training solutions for recognizing unseen objects in images for semantic segmentation. Existing open-vocabulary approaches leverage vision-language models, such as CLIP, to align visual features with rich semantic features acquired through pre-training on large-scale vision-language datasets. However, the text prompts employed …