Status | Published |
PIXEL: Prompt-based Zero-shot Hashing via Visual and Textual Semantic Alignment | |
Dong, Zeyu1; Long, Qingqing2; Zhou, Yihang2; Wang, Pengfei2; Zhu, Zhihong3; Luo, Xiao4; Wang, Yidong3; Wang, Pengyang5 | |
2024-11 | |
Conference Name | 33rd ACM International Conference on Information and Knowledge Management, CIKM 2024 |
Source Publication | International Conference on Information and Knowledge Management, Proceedings |
Pages | 487-496 |
Conference Date | 21-25 October 2024 |
Conference Place | Boise, Idaho |
Country | USA |
Publisher | Association for Computing Machinery |
Abstract | Zero-Shot Hashing (ZSH) has attracted significant attention for its efficiency and generalizability in multi-modal retrieval scenarios. It aims to encode semantic information into hash codes without requiring labeled training samples from unseen classes. In addition to the commonly used images as visual semantics and class labels as global semantics, the corresponding attribute descriptions contain critical local semantics with detailed information. However, most existing methods focus on leveraging the extracted numerical attribute values without exploring the textual semantics in attribute descriptions. To bridge this gap, we propose Prompt-based zero-shot hashing via vIsual and teXtual sEmantic aLignment, namely PIXEL. Concretely, we design an attribute prompt template based on attribute descriptions so that the model captures the corresponding local semantics. Then, having obtained the textual and visual embeddings, we propose an alignment module to model the intra- and inter-class contrastive distances. In addition, attribute-wise and class-wise constraints are used to collaboratively learn the hash code, image representation, and visual attributes more effectively. Finally, extensive experimental results demonstrate the superiority of PIXEL. |
Keyword | Attribute Hashing; Image Prompting; Semantic Alignment; Zero-shot Hashing |
DOI | 10.1145/3627673.3679747 |
Language | English |
Scopus ID | 2-s2.0-85210037495 |
Document Type | Conference paper |
Collection | THE STATE KEY LABORATORY OF INTERNET OF THINGS FOR SMART CITY (UNIVERSITY OF MACAU) |
Affiliation | 1. Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Computer Network Information Center, Chinese Academy of Sciences, Beijing, China; 2. Computer Network Information Center, Chinese Academy of Sciences, University of the Chinese Academy of Sciences, Beijing, China; 3. Peking University, Beijing, China; 4. University of California, Los Angeles, Los Angeles, United States; 5. University of Macau, Macau, Macao |
Recommended Citation GB/T 7714 | Dong, Zeyu, Long, Qingqing, Zhou, Yihang, et al. PIXEL: Prompt-based Zero-shot Hashing via Visual and Textual Semantic Alignment[C]: Association for Computing Machinery, 2024, 487-496. |
APA | Dong, Zeyu., Long, Qingqing., Zhou, Yihang., Wang, Pengfei., Zhu, Zhihong., Luo, Xiao., Wang, Yidong., Wang, Pengyang., & Zhou, Yuanchun (2024). PIXEL: Prompt-based Zero-shot Hashing via Visual and Textual Semantic Alignment. International Conference on Information and Knowledge Management, Proceedings, 487-496. |