TokenVerse: Versatile Multi-concept Personalization in Token Modulation Space

Type: Preprint

Publication Date: 2025-01-21

Citations: 0

DOI: https://doi.org/10.48550/arxiv.2501.12224

Abstract

We present TokenVerse, a method for multi-concept personalization that leverages a pre-trained text-to-image diffusion model. Our framework can disentangle complex visual elements and attributes from as little as a single image, while enabling seamless plug-and-play generation of combinations of concepts extracted from multiple images. As opposed to existing works, TokenVerse can handle multiple images with multiple concepts each, and supports a wide range of concepts, including objects, accessories, materials, pose, and lighting. Our work exploits a DiT-based text-to-image model, in which the input text affects the generation through both attention and modulation (shift and scale). We observe that the modulation space is semantic and enables localized control over complex concepts. Building on this insight, we devise an optimization-based framework that takes as input an image and a text description, and finds for each word a distinct direction in the modulation space. These directions can then be used to generate new images that combine the learned concepts in a desired configuration. We demonstrate the effectiveness of TokenVerse in challenging personalization settings, and showcase its advantages over existing methods. Project webpage: https://token-verse.github.io/
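
The shift-and-scale modulation and the per-word directions mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the formula `x * (1 + scale) + shift` is one common way to write such a modulation and is assumed here, and every name and number (`modulate`, `add_direction`, `learned_directions`, `modulate_for_word`, the two-dimensional vectors) is hypothetical and chosen only for illustration.

```python
from typing import Dict, List

Vector = List[float]


def modulate(features: Vector, scale: Vector, shift: Vector) -> Vector:
    """Apply a shift-and-scale modulation: x * (1 + scale) + shift (assumed form)."""
    return [x * (1.0 + s) + b for x, s, b in zip(features, scale, shift)]


def add_direction(base: Vector, direction: Vector, strength: float = 1.0) -> Vector:
    """Offset a base modulation vector along a per-word direction."""
    return [b + strength * d for b, d in zip(base, direction)]


# Hypothetical per-word directions, standing in for what an optimization
# over an image and its text description might produce.
learned_directions: Dict[str, Vector] = {
    "dog": [0.5, 0.0],
    "hat": [0.0, -0.25],
}

# Toy base modulation and features (assumed values, not from the paper).
base_scale: Vector = [0.0, 0.0]
base_shift: Vector = [0.0, 0.0]
features: Vector = [2.0, 4.0]


def modulate_for_word(word: str) -> Vector:
    """Modulate the toy features using the direction stored for `word`, if any."""
    direction = learned_directions.get(word, [0.0] * len(base_scale))
    scale = add_direction(base_scale, direction)
    return modulate(features, scale, base_shift)


if __name__ == "__main__":
    for word in ("dog", "hat", "sky"):
        print(word, modulate_for_word(word))
```

In this sketch a word with no stored direction ("sky") leaves the features unchanged, while each stored direction changes only its own coordinate, which loosely mirrors the idea of localized, per-word control.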

Locations

  • arXiv (Cornell University)

Similar Works

  • Multi-Concept T2I-Zero: Tweaking Only The Text Embeddings and Nothing Else (2023). Hazarapet Tunanyan, Dejia Xu, Shant Navasardyan, Shuicheng Yan, Humphrey Shi
  • ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models (2023). Yuxin Zhang, Weiming Dong, Fan Tang, Nisha Huang, Haibin Huang, Chongyang Ma, Tong‐Yee Lee, Oliver Deußen, Changsheng Xu
  • CatVersion: Concatenating Embeddings for Diffusion-Based Text-to-Image Personalization (2023). Ruoyu Zhao, Mingrui Zhu, Shiyin Dong, Nannan Wang, Xinbo Gao
  • A Neural Space-Time Representation for Text-to-Image Personalization (2023). Yuval Alaluf, Elad Richardson, Gal Metzer, Daniel Cohen‐Or
  • Learning to Customize Text-to-Image Diffusion In Diverse Context (2024). Taewook Kim, Wei Chen, Qiang Qiu
  • DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization (2024). Jisu Nam, Heesu Kim, DongJae Lee, Siyoon Jin, Seungryong Kim, Seunggyu Chang
  • Visual Concept-driven Image Generation with Text-to-Image Diffusion Model (2024). Tanzila Rahman, Shweta Mahajan, Hsin-Ying Lee, Jian Ren, Sergey Tulyakov, Leonid Sigal
  • AttenCraft: Attention-guided Disentanglement of Multiple Concepts for Text-to-Image Customization (2024). Junjie Shentu, Matthew Watson, Noura Al Moubayed
  • Key-Locked Rank One Editing for Text-to-Image Personalization (2023). Yoad Tewel, Rinon Gal, Gal Chechik, Yuval Atzmon
  • Textual Localization: Decomposing Multi-concept Images for Subject-Driven Text-to-Image Generation (2024). Junjie Shentu, Matthew Watson, Noura Al Moubayed
  • U-VAP: User-specified Visual Appearance Personalization via Decoupled Self Augmentation (2024). You Wu, K. Liu, Xiaoyue Mi, Fan Tang, Juan Cao, Jintao Li
  • Infusion: Preventing Customized Text-to-Image Diffusion from Overfitting (2024). Weili Zeng, Yichao Yan, Qi Zhu, Zhuo Chen, Pengzhi Chu, Weiming Zhao, Xiaokang Yang
  • InstantBooth: Personalized Text-to-Image Generation without Test-Time Finetuning (2023). Jing Shi, Wei Xiong, Zhe Lin, Hyun Joon Jung
  • MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models (2024). Donghao Zhou, Jiancheng Huang, Jinbin Bai, Jiaze Wang, Hao Chen, Guangyong Chen, Xiaowei Hu, Pheng‐Ann Heng
  • Concept Conductor: Orchestrating Multiple Personalized Concepts in Text-to-Image Synthesis (2024). Zebin Yao, Fangxiang Feng, Ruifan Li, Xiaojie Wang
  • Pick-and-Draw: Training-free Semantic Guidance for Text-to-Image Personalization (2024). Henglei Lv, Jiayu Xiao, Liang Li, Qingming Huang
  • Nested Attention: Semantic-aware Attention Values for Concept Personalization (2025). Or Patashnik, Rinon Gal, Daniil Ostashev, Sergey Tulyakov, Kfir Aberman, Daniel Cohen‐Or
  • $\lambda$-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space (2024). Maitreya Patel, Sangmin Jung, Chitta Baral, Yezhou Yang

Works That Cite This (0)


Works Cited by This (0)
