Set Prediction Guided by Semantic Concepts for Diverse Video Captioning

Diverse video captioning aims to generate a set of sentences that describe a given video from various aspects. Mainstream methods are trained on independent pairs of a video and a single caption from its ground-truth set, without exploiting intra-set relationships, which results in low diversity among the generated captions. Different from them, …