Status | Published
Title | More Efficient and Locally Enhanced Transformer
Author | Zhu, Zhefeng¹; Qi, Ke¹; Zhou, Yicong²; Chen, Wenbin¹; Zhang, Jingdong³
Year | 2023
Conference Name | 29th International Conference on Neural Information Processing, ICONIP 2022 |
Source Publication | Communications in Computer and Information Science |
Volume | 1792 CCIS |
Pages | 86-97 |
Conference Date | NOV 22-26, 2022 |
Conference Place | Virtual, Online |
Publisher | Springer Science and Business Media Deutschland GmbH |
Abstract | To address two problems in current ViT models, the expensive computational cost of self-attention and the weakening of local feature information by cascaded self-attention, this paper proposes an ESA (Efficient Self-attention) module that reduces computational complexity and an LE (Locally Enhanced) module that strengthens local information. The ESA module ranks the attention intensities between the class token and the patch tokens in each Transformer encoder of the ViT model, retains in the attention matrix only the weights of patch tokens strongly associated with the class token, and reuses the attention matrices of adjacent layers, thereby reducing computation and accelerating model inference. The LE module places a depth-wise convolution in parallel within each Transformer encoder, enabling the Transformer to capture global feature information while strengthening local feature information, which effectively improves the image recognition rate. Extensive experiments on common image recognition datasets such as Tiny ImageNet, CIFAR-10, and CIFAR-100 show that the proposed method achieves better recognition accuracy with less computation.
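
For intuition only, below is a minimal PyTorch sketch of the two mechanisms the abstract describes: class-token-guided pruning of the attention matrix with optional reuse of an adjacent layer's attention (ESA), and a parallel depth-wise convolution branch over the patch tokens (LE). All names (`ESABlock`, `keep_ratio`, `reuse_attn`) and design details (the top-k threshold, the residual placement) are assumptions for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class ESABlock(nn.Module):
    """Sketch of a Transformer encoder block with (a) pruning of attention
    weights from the class token to weakly associated patch tokens, with
    optional reuse of an adjacent layer's attention matrix, and (b) a
    parallel depth-wise convolution branch over the patch tokens."""

    def __init__(self, dim=192, heads=3, keep_ratio=0.5):
        super().__init__()
        self.heads = heads
        self.scale = (dim // heads) ** -0.5
        self.keep_ratio = keep_ratio  # fraction of patch tokens to keep
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)
        self.norm = nn.LayerNorm(dim)
        # Depth-wise conv over the 2-D patch-token grid (LE branch).
        self.dw_conv = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)

    def forward(self, x, grid_size, reuse_attn=None):
        # x: (B, 1 + N, C), index 0 is the class token, N = H * W patches.
        B, T, C = x.shape
        h = self.norm(x)
        q, k, v = self.qkv(h).chunk(3, dim=-1)
        q = q.view(B, T, self.heads, -1).transpose(1, 2)
        k = k.view(B, T, self.heads, -1).transpose(1, 2)
        v = v.view(B, T, self.heads, -1).transpose(1, 2)

        if reuse_attn is None:
            attn = (q @ k.transpose(-2, -1)) * self.scale  # (B, h, T, T)
            # Rank patch tokens by their attention to the class token and
            # mask out the weakly associated ones (keep top-k per head).
            cls_attn = attn[:, :, 0, 1:]                   # (B, h, N)
            k_keep = max(1, int(self.keep_ratio * (T - 1)))
            thresh = cls_attn.topk(k_keep, dim=-1).values[..., -1:]
            pruned = cls_attn.masked_fill(cls_attn < thresh, float("-inf"))
            attn[:, :, 0, 1:] = pruned
            attn = attn.softmax(dim=-1)
        else:
            # Reuse the attention matrix computed by the adjacent layer,
            # skipping the quadratic q @ k^T computation entirely.
            attn = reuse_attn

        out = self.proj((attn @ v).transpose(1, 2).reshape(B, T, C))

        # LE branch: depth-wise conv on the patch tokens only, in parallel
        # with self-attention, then merged back into the token sequence.
        H, W = grid_size
        patches = h[:, 1:].transpose(1, 2).reshape(B, C, H, W)
        local = self.dw_conv(patches).flatten(2).transpose(1, 2)  # (B, N, C)
        out[:, 1:] = out[:, 1:] + local

        return x + out, attn
```

Under this reading, a pair of adjacent blocks would share one attention computation, e.g. `x, attn = block_a(x, (14, 14))` followed by `x, _ = block_b(x, (14, 14), reuse_attn=attn)`, which is where the claimed savings in computation and inference time would come from.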
Keyword | Efficient Self-attention; Image Recognition; Locally Enhanced; ViT
DOI | 10.1007/978-981-99-1642-9_8 |
Language | English
Scopus ID | 2-s2.0-85161683342 |
Document Type | Conference paper |
Collection | Faculty of Science and Technology > Department of Computer and Information Science
Corresponding Author | Qi,Ke |
Affiliation | 1. Guangzhou University, Guangzhou, China; 2. University of Macau, Taipa, Macao; 3. South China Normal University, Guangzhou, China
Recommended Citation GB/T 7714 | Zhu, Zhefeng, Qi, Ke, Zhou, Yicong, et al. More Efficient and Locally Enhanced Transformer[C]. Springer Science and Business Media Deutschland GmbH, 2023: 86-97.
APA | Zhu, Zhefeng., Qi, Ke., Zhou, Yicong., Chen, Wenbin., & Zhang, Jingdong. (2023). More Efficient and Locally Enhanced Transformer. Communications in Computer and Information Science, 1792 CCIS, 86-97.
Files in This Item | There are no files associated with this item.