Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models

Token-based masked generative models are gaining popularity for the fast inference that parallel decoding enables. While recent token-based approaches achieve performance competitive with diffusion-based models, their generation quality is still suboptimal because they sample multiple tokens simultaneously without considering the dependence among them. We empirically investigate this problem and propose …