Training-free Regional Prompting for Diffusion Transformers

Diffusion models have demonstrated excellent capabilities in text-to-image generation. Their semantic understanding (i.e., prompt-following ability) has also been greatly improved by large language models (e.g., T5, Llama). However, existing models cannot perfectly handle long and complex text prompts, especially when the prompts contain various objects with numerous attributes …