VAST 1.0: A Unified Framework for Controllable and Consistent Video
Generation
Generating high-quality videos from textual descriptions makes it difficult to maintain temporal coherence and to control subject motion. We propose VAST (Video As Storyboard from Text), a two-stage framework that addresses these challenges. In the first stage, StoryForge transforms textual descriptions into detailed storyboards, capturing human …