Context Disentangling and Prototype Inheriting for Robust Visual Grounding

Visual grounding (VG) aims to locate a specific target in an image based on a given language query. Discriminative information from context is important for distinguishing the target from other objects, particularly for targets that share a category with other objects. However, most previous methods underestimate such information. …