Object-Aware Aggregation With Bidirectional Temporal Graph for Video Captioning

Video captioning aims to automatically generate natural language descriptions of video content and has drawn a lot of attention in recent years. Generating accurate and fine-grained captions requires not only understanding the global content of a video but also capturing detailed object information. Meanwhile, video representations have a great impact on …