In-Context Ensemble Improves Video-Language Models for Low-Level Workflow Understanding from Human Demonstrations

A Standard Operating Procedure (SOP) defines a low-level, step-by-step written guide for a business software workflow based on a video demonstration. SOPs are a crucial step toward automating end-to-end software workflows. Manually creating SOPs can be time-consuming. Recent advancements in large video-language models offer the potential for automating SOP generation …