Residential College: false
Status: Published
PIXEL: Prompt-based Zero-shot Hashing via Visual and Textual Semantic Alignment
Dong, Zeyu [1]; Long, Qingqing [2]; Zhou, Yihang [2]; Wang, Pengfei [2]; Zhu, Zhihong [3]; Luo, Xiao [4]; Wang, Yidong [3]; Wang, Pengyang [5]; Zhou, Yuanchun [2]
2024-11
Conference Name: 33rd ACM International Conference on Information and Knowledge Management, CIKM 2024
Source Publication: International Conference on Information and Knowledge Management, Proceedings
Pages: 487-496
Conference Date: 21-25 October 2024
Conference Place: Boise, Idaho
Country: USA
Publisher: Association for Computing Machinery
Abstract

Zero-Shot Hashing (ZSH) has attracted significant attention for its efficiency and generalizability in multi-modal retrieval scenarios. It aims to encode semantic information into hash codes without requiring labeled training samples from unseen classes. Beyond the images commonly used as visual semantics and the class labels used as global semantics, the corresponding attribute descriptions carry critical local semantics with detailed information. However, most existing methods rely on the extracted numerical attribute values and do not explore the textual semantics of the attribute descriptions. To bridge this gap, we propose Prompt-based zero-shot hashing via vIsual and teXtual sEmantic aLignment, namely PIXEL. Concretely, we design an attribute prompt template based on the attribute descriptions so that the model captures the corresponding local semantics. Then, after obtaining the textual and visual embeddings, we propose an alignment module to model the intra- and inter-class contrastive distances. In addition, an attribute-wise constraint and a class-wise constraint are used to collaboratively learn the hash code, image representation, and visual attributes more effectively. Finally, extensive experimental results demonstrate the superiority of PIXEL.
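
For readers unfamiliar with the general setting, the following is a minimal sketch of two generic ideas the abstract touches on: turning attribute descriptions into a text prompt, and retrieving by Hamming distance over binary hash codes. It is not the PIXEL implementation. The template wording, the function names, and the sign-based binarization are all assumptions made for illustration; the abstract does not specify any of them.

```python
from typing import List, Sequence


def build_attribute_prompt(class_name: str, attributes: Sequence[str]) -> str:
    """Hypothetical template: fold attribute descriptions into one text prompt.

    The wording is an illustrative assumption, not the template from the paper.
    """
    return f"A photo of a {class_name}, which has " + ", ".join(attributes) + "."


def binarize(embedding: Sequence[float]) -> List[int]:
    """Sign-based binarization (a common generic choice, assumed here):
    non-negative dimensions map to 1, negative dimensions map to 0."""
    return [1 if x >= 0 else 0 for x in embedding]


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length hash codes differ."""
    if len(a) != len(b):
        raise ValueError("hash codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def rank_by_hamming(query: Sequence[int], database: Sequence[Sequence[int]]) -> List[int]:
    """Return database indices ordered from nearest to farthest hash code."""
    return sorted(range(len(database)), key=lambda i: hamming_distance(query, database[i]))


if __name__ == "__main__":
    print(build_attribute_prompt("zebra", ["black and white stripes", "four legs"]))
    query_code = binarize([0.3, -1.2, 0.0, 2.5])
    codes = [[0, 1, 0, 0], [1, 0, 1, 1], [1, 0, 0, 1]]
    print(rank_by_hamming(query_code, codes))
```

In a real system the embeddings would come from learned image and text encoders, and the hash function would be trained rather than fixed; the sketch only shows why compact binary codes make retrieval cheap.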

Keywords: Attribute Hashing; Image Prompting; Semantic Alignment; Zero-shot Hashing
DOI: 10.1145/3627673.3679747
Language: English
Scopus ID: 2-s2.0-85210037495
Document Type: Conference paper
Collection: THE STATE KEY LABORATORY OF INTERNET OF THINGS FOR SMART CITY (UNIVERSITY OF MACAU)
Affiliation:
1. Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Computer Network Information Center, Chinese Academy of Sciences, Beijing, China
2. Computer Network Information Center, Chinese Academy of Sciences, University of the Chinese Academy of Sciences, Beijing, China
3. Peking University, Beijing, China
4. University of California, Los Angeles, Los Angeles, United States
5. University of Macau, Macau, Macao
Recommended Citation
GB/T 7714: Dong, Zeyu, Long, Qingqing, Zhou, Yihang, et al. PIXEL: Prompt-based Zero-shot Hashing via Visual and Textual Semantic Alignment[C]: Association for Computing Machinery, 2024, 487-496.
APA: Dong, Zeyu, Long, Qingqing, Zhou, Yihang, Wang, Pengfei, Zhu, Zhihong, Luo, Xiao, Wang, Yidong, Wang, Pengyang, & Zhou, Yuanchun (2024). PIXEL: Prompt-based Zero-shot Hashing via Visual and Textual Semantic Alignment. International Conference on Information and Knowledge Management, Proceedings, 487-496.

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.