Weakly Supervised Temporal Adjacent Network for Language Grounding

Temporal language grounding (TLG) is a fundamental and challenging problem in vision and language understanding. Existing methods mainly focus on the fully supervised setting, which requires temporal boundary labels for training and therefore incurs a high annotation cost. In this work, we focus on weakly supervised TLG, where multiple description sentences …