TokenVerse: Multi-Concept Personalization in Text-to-Image Diffusion Models
2025-01-28

TokenVerse is a method for multi-concept personalization built on a pre-trained text-to-image diffusion model. It disentangles complex visual elements and attributes from a single image, and it can generate new images that combine concepts extracted from multiple images. Unlike existing methods, which are limited in the type or breadth of concepts they handle, TokenVerse accepts multiple images with multiple concepts each and supports objects, accessories, materials, pose, and lighting. For each word, it optimizes a distinct direction in the model's modulation space; these per-word directions are then used to generate images that combine the desired concepts. Experiments demonstrate its effectiveness in challenging personalization settings.
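
As a rough illustration of the per-word directions described above, the toy sketch below gives each prompt token its own modulation vector: a shared base vector plus that word's learned offset, when one exists. This is not the authors' implementation. The function name per_token_modulation, the vector size, the random stand-in values, and the reading of a "direction" as an additive offset to a shared modulation vector are all assumptions made for the example.

```python
import numpy as np

def per_token_modulation(tokens, base, directions):
    """Return one modulation vector per token.

    Tokens with a learned direction get base + direction; all others get base.
    (Hypothetical helper; additive offsets are an assumption of this sketch.)
    """
    zero = np.zeros_like(base)
    rows = [base + directions.get(tok, zero) for tok in tokens]
    return np.stack(rows)

rng = np.random.default_rng(0)
dim = 8                       # toy size, not the model's real modulation width
base = rng.normal(size=dim)   # stand-in for the shared modulation vector

# Random stand-ins for directions that would be optimized separately,
# one set per source image.
directions = {
    "dog": rng.normal(size=dim),       # concept taken from image A
    "lighting": rng.normal(size=dim),  # concept taken from image B
}

# A new prompt that mixes concepts learned from different images.
prompt = ["a", "dog", "in", "dramatic", "lighting"]
mod = per_token_modulation(prompt, base, directions)
```

Because the directions are stored per word, concepts learned from different images can be mixed simply by using their words together in one prompt; words without a learned direction fall back to the shared vector.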