On the use of Vision-Language models for Visual Sentiment Analysis: a study on CLIP

This work presents a study on how to exploit the CLIP embedding space to perform Visual Sentiment Analysis. We experiment with two architectures built on top of the CLIP embedding space, which we denote by CLIP-E. We train the CLIP-E models with WEBEmo, the largest publicly available and manually labeled …
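A minimal sketch of what "an architecture built on top of the CLIP embedding space" can look like in practice: a small classification head trained on frozen CLIP image embeddings. Everything specific here (the ViT-B/32 checkpoint, the SentimentHead module, the two-class output, the example image path) is an illustrative assumption, not the CLIP-E architecture described in the paper.

```python
import torch
import torch.nn as nn
from transformers import CLIPModel, CLIPProcessor
from PIL import Image

# Frozen CLIP backbone (ViT-B/32 assumed here; the paper may use another variant).
backbone = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False


class SentimentHead(nn.Module):
    """Hypothetical lightweight head mapping CLIP image embeddings to sentiment classes."""

    def __init__(self, embed_dim: int = 512, num_classes: int = 2):
        super().__init__()
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, emb: torch.Tensor) -> torch.Tensor:
        return self.fc(emb)


head = SentimentHead()

# Embed one image with the frozen backbone, then classify the embedding.
image = Image.open("example.jpg")  # placeholder path
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    emb = backbone.get_image_features(**inputs)  # shape (1, 512) for ViT-B/32
logits = head(emb)  # sentiment logits, e.g. negative vs. positive
```

In a setup like this only the head's parameters are optimized on the sentiment labels, so the shared image-text geometry of the CLIP embedding space is preserved while the classifier adapts it to the sentiment task.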