Listen to Look: Action Recognition by Previewing Audio

In the face of the video data deluge, today's expensive clip-level classifiers are increasingly impractical. We propose a framework for efficient action recognition in untrimmed video that uses audio as a preview mechanism to eliminate both short-term and long-term visual redundancies. First, we devise an ImgAud2Vid framework that hallucinates clip-level …