Prompt-aligned Gradient for Prompt Tuning

Thanks to large pre-trained vision-language models (VLMs) like CLIP [37], we can craft a zero-shot classifier through discrete prompt design: e.g., the confidence score of an image being "[CLASS]" can be obtained from the VLM-provided similarity between the image and the prompt sentence "a photo of a …
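
The similarity-based scoring described above can be illustrated with a minimal sketch. It assumes the image and each prompt sentence have already been encoded into vectors by some VLM; the function name `zero_shot_scores`, the toy embeddings, and the `temperature` value are hypothetical and chosen only for illustration, not taken from the paper or from any CLIP API.

```python
import numpy as np


def zero_shot_scores(image_emb, text_embs, temperature=0.01):
    """Turn image-prompt cosine similarities into per-class confidence scores.

    image_emb: (d,) embedding of one image (assumed to come from a VLM image encoder).
    text_embs: (num_classes, d) embeddings of one prompt sentence per class.
    temperature: softmax temperature; 0.01 is an illustrative assumption.
    """
    # L2-normalise so that the dot product equals cosine similarity.
    image_emb = image_emb / np.linalg.norm(image_emb)
    text_embs = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)

    # One similarity per class, scaled by the temperature.
    logits = text_embs @ image_emb / temperature

    # Numerically stable softmax over classes.
    logits = logits - logits.max()
    probs = np.exp(logits)
    return probs / probs.sum()


# Toy stand-ins for encoder outputs: three class prompts in a 4-d space.
text_embs = np.eye(3, 4)
image_emb = np.array([0.1, 0.9, 0.2, 0.0])

scores = zero_shot_scores(image_emb, text_embs)
predicted_class = int(np.argmax(scores))
```

In this toy setup the image embedding lies closest to the second prompt, so `predicted_class` is the index of that prompt. With a real VLM, the two inputs would be replaced by the outputs of its image and text encoders.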