TokenVerse: Versatile Multi-concept Personalization in Token Modulation Space

Daniel Garibi, Shahar Yadin, Roni Paiss, Omer Tov, Shiran Zada, Ariel Ephrat, Tomer Michaeli, Inbar Mosseri, Tali Dekel

Type: Preprint

Publication Date: 2025-01-21

Citations: 0

DOI: https://doi.org/10.48550/arxiv.2501.12224

Abstract

We present TokenVerse, a method for multi-concept personalization that leverages a pre-trained text-to-image diffusion model. Our framework can disentangle complex visual elements and attributes from as little as a single image, while enabling seamless plug-and-play generation of combinations of concepts extracted from multiple images. As opposed to existing works, TokenVerse can handle multiple images with multiple concepts each, and supports a wide range of concepts, including objects, accessories, materials, pose, and lighting. Our work exploits a DiT-based text-to-image model, in which the input text affects the generation through both attention and modulation (shift and scale). We observe that the modulation space is semantic and enables localized control over complex concepts. Building on this insight, we devise an optimization-based framework that takes as input an image and a text description, and finds for each word a distinct direction in the modulation space. These directions can then be used to generate new images that combine the learned concepts in a desired configuration. We demonstrate the effectiveness of TokenVerse in challenging personalization settings, and showcase its advantages over existing methods. Project webpage: https://token-verse.github.io/
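
The abstract's central idea, a per-word direction in a shift-and-scale modulation space, can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the tiny vector sizes, the `features * (1 + scale) + shift` form, and the choice to read the first half of a modulation vector as shift and the second half as scale are all assumptions made for illustration. In an actual DiT the shift and scale would come from learned layers, and the directions would be found by optimization rather than written by hand.

```python
from typing import Dict, List

Vector = List[float]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def per_token_modulation(tokens: List[str], shared: Vector,
                         directions: Dict[str, Vector]) -> List[Vector]:
    """Give each token the shared modulation vector plus its own offset.

    Tokens without an entry in `directions` keep the shared vector unchanged.
    (Hypothetical helper; the offsets stand in for learned directions.)
    """
    zero = [0.0] * len(shared)
    return [add(shared, directions.get(tok, zero)) for tok in tokens]


def modulate(features: Vector, modulation: Vector) -> Vector:
    """Apply shift and scale: out = features * (1 + scale) + shift.

    Assumption: the first half of `modulation` is the shift and the second
    half is the scale.
    """
    half = len(modulation) // 2
    shift, scale = modulation[:half], modulation[half:]
    return [f * (1.0 + s) + b for f, s, b in zip(features, scale, shift)]


# Toy prompt: only "dog" and "hat" carry hand-written (not learned) directions.
tokens = ["a", "dog", "wearing", "a", "hat"]
shared = [0.0, 0.0, 0.0, 0.0]            # 2 shift values + 2 scale values
directions = {
    "dog": [0.5, 0.0, 1.0, 0.0],
    "hat": [0.0, 1.0, 0.0, 0.5],
}
token_features = [[1.0, 2.0] for _ in tokens]

modulations = per_token_modulation(tokens, shared, directions)
outputs = [modulate(f, m) for f, m in zip(token_features, modulations)]

for tok, out in zip(tokens, outputs):
    print(tok, out)
```

In this sketch only the tokens with a direction are changed, which mirrors the localized, plug-and-play behavior the abstract describes: directions obtained from different images could in principle be combined in one dictionary and applied to a new prompt.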

Locations

arXiv (Cornell University)

Similar Works

Title	Year	Authors
Multi-Concept T2I-Zero: Tweaking Only The Text Embeddings and Nothing Else	2023	Hazarapet Tunanyan Dejia Xu Shant Navasardyan Shuicheng Yan Humphrey Shi
ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models	2023	Yuxin Zhang Weiming Dong Fan Tang Nisha Huang Haibin Huang Chongyang Ma Tong‐Yee Lee Oliver Deußen Changsheng Xu
CatVersion: Concatenating Embeddings for Diffusion-Based Text-to-Image Personalization	2023	Ruoyu Zhao Mingrui Zhu Shiyin Dong Nannan Wang Xinbo Gao
A Neural Space-Time Representation for Text-to-Image Personalization	2023	Yuval Alaluf Elad Richardson Gal Metzer Daniel Cohen‐Or
Learning to Customize Text-to-Image Diffusion In Diverse Context	2024	Taewook Kim Wei Chen Qiang Qiu
DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization	2024	Jisu Nam Heesu Kim DongJae Lee Siyoon Jin Seungryong Kim Seunggyu Chang
Visual Concept-driven Image Generation with Text-to-Image Diffusion Model	2024	Tanzila Rahman Shweta Mahajan Hsin-Ying Lee Jian Ren Sergey Tulyakov Leonid Sigal
AttenCraft: Attention-guided Disentanglement of Multiple Concepts for Text-to-Image Customization	2024	Junjie Shentu Matthew Watson Noura Al Moubayed
Key-Locked Rank One Editing for Text-to-Image Personalization	2023	Yoad Tewel Rinon Gal Gal Chechik Yuval Atzmon
Textual Localization: Decomposing Multi-concept Images for Subject-Driven Text-to-Image Generation	2024	Junjie Shentu Matthew Watson Noura Al Moubayed
U-VAP: User-specified Visual Appearance Personalization via Decoupled Self Augmentation	2024	You Wu K. Liu Xiaoyue Mi Fan Tang Juan Cao Jintao Li
Infusion: Preventing Customized Text-to-Image Diffusion from Overfitting	2024	Weili Zeng Yichao Yan Qi Zhu Zhuo Chen Pengzhi Chu Weiming Zhao Xiaokang Yang
InstantBooth: Personalized Text-to-Image Generation without Test-Time Finetuning	2023	Jing Shi Wei Xiong Zhe Lin Hyun Joon Jung
MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models	2024	Donghao Zhou Jiancheng Huang Jinbin Bai Jiaze Wang Hao Chen Guangyong Chen Xiaowei Hu Pheng‐Ann Heng
Concept Conductor: Orchestrating Multiple Personalized Concepts in Text-to-Image Synthesis	2024	Zebin Yao Fangxiang Feng Ruifan Li Xiaojie Wang
Pick-and-Draw: Training-free Semantic Guidance for Text-to-Image Personalization	2024	Henglei Lv Jiayu Xiao Liang Li Qingming Huang
Nested Attention: Semantic-aware Attention Values for Concept Personalization	2025	Or Patashnik Rinon Gal Daniil Ostashev Sergey Tulyakov Kfir Aberman Daniel Cohen‐Or
$\lambda$-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space	2024	Maitreya Patel Sangmin Jung Chitta Baral Yezhou Yang

Works That Cite This (0)

Works Cited by This (0)
