Reconstruct and Represent Video Contents for Captioning via Reinforcement Learning

This paper addresses the problem of describing the visual content of a video sequence in natural language. Unlike previous video captioning work, which mainly exploits cues from the video content to generate a language description, we propose a reconstruction network (RecNet) with a novel encoder-decoder-reconstructor architecture, which leverages both forward …