Personalized Visual Instruction Tuning

Multimodal large language models (MLLMs) have made significant progress in recent years; however, these models exhibit a notable limitation, which we refer to as "face blindness". Specifically, they can engage in general conversations but fail to conduct personalized dialogues targeting specific individuals. This deficiency hinders the application of MLLMs …