Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models

We present ODISE: Open-vocabulary DIffusion-based panoptic SEgmentation, which unifies pre-trained text-to-image diffusion models and discriminative models to perform open-vocabulary panoptic segmentation. Text-to-image diffusion models have the remarkable ability to generate high-quality images from diverse open-vocabulary language descriptions. This demonstrates that their internal representation space is highly correlated with open concepts in …