Browse/Search Results: 1-3 of 3

SmartEdit: Exploring Complex Instruction-Based Image Editing with Multimodal Large Language Models (Conference paper)
Huang, Yuzhou, Xie, Liangbin, Wang, Xintao, Yuan, Ziyang, Cun, Xiaodong, Ge, Yixiao, Zhou, Jiantao, Dong, Chao, Huang, Rui, Zhang, Ruimao, Shan, Ying. SmartEdit: Exploring Complex Instruction-Based Image Editing with Multimodal Large Language Models[C]:IEEE Computer Society, 2024, 8362-8371.
Authors:  Huang, Yuzhou;  Xie, Liangbin;  Wang, Xintao;  Yuan, Ziyang;  Cun, Xiaodong; et al.
TC[WOS]: 0 | TC[Scopus]: 1 | Submit date: 2024/11/05
Keywords: Training; Visualization; Computer Vision; Large Language Models; Diffusion Models; Cognition; Pattern Recognition; Instruction-based Image Editing; Multimodal Large Language Models
COMMA: Co-articulated Multi-Modal Learning (Conference paper)
Hu, Lianyu, Gao, Liqing, Liu, Zekang, Pun, Chi Man, Feng, Wei. COMMA: Co-articulated Multi-Modal Learning[C]:Association for the Advancement of Artificial Intelligence, 2024, 2238-2246.
Authors:  Hu, Lianyu;  Gao, Liqing;  Liu, Zekang;  Pun, Chi Man;  Feng, Wei
TC[Scopus]: 0 | Submit date: 2024/05/16
Keywords: CV: Language and Vision; CV: Large Vision Models; CV: Multi-modal Vision; CV: Video Understanding & Activity Analysis
The Neglected Tails in Vision-Language Models (Conference paper)
Parashar, Shubham, Lin, Zhiqiu, Liu, Tian, Dong, Xiangjue, Li, Yanan, Ramanan, Deva, Caverlee, James, Kong, Shu. The Neglected Tails in Vision-Language Models[C]:IEEE Computer Society, 2024, 12988-12997.
Authors:  Parashar, Shubham;  Lin, Zhiqiu;  Liu, Tian;  Dong, Xiangjue;  Li, Yanan; et al.
TC[WOS]: 2 | TC[Scopus]: 4 | Submit date: 2024/11/05
Keywords: Long Tailed Recognition; Vision-language Models; Zero-shot Recognition