Lateralization LoRA: Interleaved Instruction Tuning with Modality-Specialized Adaptations

Recent advancements in Vision-Language Models (VLMs) have led to the development of Vision-Language Generalists (VLGs) capable of understanding and generating interleaved images and text. Despite these advances, VLGs still struggle to follow user instructions for interleaved text and image generation. To address this issue, we introduce LeafInstruct, the first open-sourced …