Ask a Question

Prefer a chat interface with context about you and your work?

MAttNet: Modular Attention Network for Referring Expression Comprehension

MAttNet: Modular Attention Network for Referring Expression Comprehension

In this paper, we address referring expression comprehension: localizing an image region described by a natural language expression. While most recent work treats expressions as a single unit, we propose to decompose them into three modular components related to subject appearance, location, and relationship to other objects. This allows us …