Dual Attention on Pyramid Feature Maps for Image Captioning
Generating natural sentences from images is a fundamental learning task for visual-semantic understanding in multimedia. In this paper, we propose applying dual attention to pyramid image feature maps to fully explore visual-semantic correlations and improve the quality of the generated sentences. Specifically, with full consideration of the contextual …