Status | Forthcoming |
StyleAdapter: A Unified Stylized Image Generation Model | |
Wang, Zhouxia [1]; Wang, Xintao [2]; Xie, Liangbin [3]; Qi, Zhongang [2]; Shan, Ying [2]; Wang, Wenping [1]; Luo, Ping [4] | |
2024-10-25 | |
Source Publication | International Journal of Computer Vision |
ISSN | 0920-5691 |
Abstract | This work focuses on generating high-quality images that combine the style of reference images with the content of provided textual descriptions. Current leading algorithms, i.e., DreamBooth and LoRA, require fine-tuning for each style, which is time-consuming and computationally expensive. In this work, we propose StyleAdapter, a unified stylized image generation model capable of producing a variety of stylized images that match both the content of a given prompt and the style of reference images, without the need for per-style fine-tuning. It introduces a two-path cross-attention (TPCA) module to process the style information and the textual prompt separately, which cooperates with a semantic suppressing vision model (SSVM) to suppress the semantic content of style images. In this way, the prompt maintains control over the content of the generated images, and the negative impact of semantic information in the style references is mitigated. As a result, the content of the generated image adheres to the prompt, and its style aligns with the style references. In addition, StyleAdapter can be integrated with existing controllable synthesis methods, such as T2I-Adapter and ControlNet, to attain a more controllable and stable generation process. Extensive experiments demonstrate the superiority of our method over previous works. |
Keyword | Stylized Image Generation; Artificial Intelligence Generated Content (AIGC); Diffusion Model; Computer Vision |
DOI | 10.1007/s11263-024-02253-x |
Indexed By | SCIE |
Language | English |
WOS Research Area | Computer Science |
WOS Subject | Computer Science, Artificial Intelligence |
WOS ID | WOS:001341170100002 |
Publisher | Springer, Van Godewijckstraat 30, 3311 GZ Dordrecht, Netherlands |
Scopus ID | 2-s2.0-85207351072 |
Document Type | Journal article |
Collection | University of Macau |
Corresponding Author | Wang, Xintao; Luo, Ping |
Affiliation | 1. The University of Hong Kong, Hong Kong SAR, China; 2. ARC Lab, Tencent PCG, Shenzhen, China; 3. The University of Macau, Macau SAR, China and Shenzhen Institute of Advanced Technology, Shenzhen, China; 4. The University of Hong Kong, Hong Kong SAR, China and Shanghai AI Laboratory, Shanghai, China |
Recommended Citation GB/T 7714 | Wang, Zhouxia, Wang, Xintao, Xie, Liangbin, et al. StyleAdapter: A Unified Stylized Image Generation Model[J]. International Journal of Computer Vision, 2024. |
APA | Wang, Z., Wang, X., Xie, L., Qi, Z., Shan, Y., Wang, W., & Luo, P. (2024). StyleAdapter: A Unified Stylized Image Generation Model. International Journal of Computer Vision. |
MLA | Wang, Zhouxia, et al. "StyleAdapter: A Unified Stylized Image Generation Model." International Journal of Computer Vision (2024). |