FontDiffuser: A Diffusion-Based Approach to One-Shot Font Generation

2025-04-24

FontDiffuser is a novel diffusion-based method for one-shot font generation that models font imitation as a noise-to-denoise process. Existing methods tend to struggle with complex characters and large style variations. To address this, FontDiffuser introduces a Multi-scale Content Aggregation (MCA) block that combines global and local content cues across scales, which helps preserve intricate strokes. It also introduces a Style Contrastive Refinement (SCR) module, a style representation learning structure that uses a style extractor to disentangle styles and then supervises the diffusion model with a style contrastive loss. Extensive experiments show that FontDiffuser achieves state-of-the-art performance, particularly on complex characters and large style changes.
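
To make the idea of a style contrastive loss more concrete, here is a minimal sketch in Python. The summary above does not give the exact formulation, so this assumes a common InfoNCE-style form: the style embedding of the generated glyph is pulled toward the embedding of a sample in the target style (the positive) and pushed away from embeddings of other styles (the negatives). The function names, the use of cosine similarity, and the temperature value of 0.07 are illustrative assumptions, not details taken from FontDiffuser.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def style_contrastive_loss(generated, positive, negatives, temperature=0.07):
    """InfoNCE-style loss (an assumed form, not the paper's exact loss).

    generated: style embedding of the generated glyph
    positive:  style embedding of a glyph in the target style
    negatives: list of style embeddings from other styles
    """
    pos_logit = cosine_similarity(generated, positive) / temperature
    neg_logits = [cosine_similarity(generated, n) / temperature for n in negatives]
    logits = [pos_logit] + neg_logits

    # Numerically stable log-sum-exp over the positive and all negatives.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))

    # Negative log-probability of picking the positive among all candidates.
    return log_sum_exp - pos_logit


if __name__ == "__main__":
    generated = [1.0, 0.0]
    target_style = [0.9, 0.1]
    other_styles = [[0.0, 1.0], [-1.0, 0.2]]
    print(style_contrastive_loss(generated, target_style, other_styles))
```

Under these assumptions the loss is small when the generated embedding lies close to the target style and far from the others, and it grows as the generated glyph drifts toward a different style. In a real training setup the embeddings would come from a learned style extractor, and this term would be added to the diffusion objective.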