Zero-shot Prompt-based Video Encoder for Surgical Gesture Recognition

Purpose: Surgical video is an important data stream for gesture recognition, so robust visual encoders for such data streams are similarly important. Methods: Leveraging the Bridge-Prompt framework, we fine-tune a pre-trained vision-text model (CLIP) for gesture recognition in surgical videos. This approach can utilize extensive outside video data such as text, but …