MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image
Synthesis
Auto-regressive models have made significant progress in language generation, yet they do not perform on par with diffusion models in image synthesis. In this work, we introduce MARS, a novel framework for text-to-image (T2I) generation that incorporates a specially designed Semantic Vision-Language Integration Expert (SemVIE). …