End-to-End Video Captioning With Multitask Reinforcement Learning

Although end-to-end (E2E) learning has led to impressive progress on a variety of visual understanding tasks, it is often impeded by hardware constraints (e.g., GPU memory) and is prone to overfitting. When it comes to video captioning, one of the most challenging benchmark tasks in computer vision, those limitations of …