Ask a Question

Prefer a chat interface with context about you and your work?

ActionCOMET: A Zero-shot Approach to Learn Image-specific Commonsense Concepts about Actions

ActionCOMET: A Zero-shot Approach to Learn Image-specific Commonsense Concepts about Actions

Humans observe various actions being performed by other humans (physically or in videos/images) and can draw a wide range of inferences about it beyond what they can visually perceive. Such inferences include determining the aspects of the world that make action execution possible (e.g. liquid objects can undergo pouring), predicting …