MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis

Auto-regressive models have made significant progress in language generation, yet they do not perform on par with diffusion models in image synthesis. In this work, we introduce MARS, a novel framework for text-to-image (T2I) generation that incorporates a specially designed Semantic Vision-Language Integration Expert (SemVIE). …