Object Relational Graph With Teacher-Recommended Learning for Video Captioning
Object Relational Graph With Teacher-Recommended Learning for Video Captioning
Taking full advantage of the information from both vision and language is critical for the video captioning task. Existing models lack adequate visual representation due to the neglect of interaction between object, and sufficient training for content-related words due to long-tailed problems. In this paper, we propose a complete video …