Status | Published
Title | More Efficient and Locally Enhanced Transformer
Author | Zhu, Zhefeng¹; Qi, Ke¹; Zhou, Yicong²; Chen, Wenbin¹; Zhang, Jingdong³
Year | 2023
Conference Name | 29th International Conference on Neural Information Processing, ICONIP 2022 |
Source Publication | Communications in Computer and Information Science |
Volume | 1792 CCIS |
Pages | 86-97 |
Conference Date | NOV 22-26, 2022 |
Conference Place | Virtual, Online |
Publisher | Springer Science and Business Media Deutschland GmbH |
Abstract | To address two problems in current ViT models, the expensive computational cost of self-attention and the weakening of local feature information by cascaded self-attention, this paper proposes an ESA (Efficient Self-attention) module that reduces computational complexity and an LE (Locally Enhanced) module that strengthens local information. The ESA module ranks the attention intensities between the class token and the patch tokens in each Transformer encoder of the ViT model, retains in the attention matrix only the weights of patch tokens strongly associated with the class token, and reuses the attention matrices of adjacent layers, thereby reducing computation and accelerating model inference. The LE module places a depth-wise convolution in parallel within each Transformer encoder, enabling the Transformer to capture global feature information while strengthening local feature information, which effectively improves the image recognition rate. Extensive experiments on common image recognition datasets such as Tiny ImageNet, CIFAR-10, and CIFAR-100 show that the proposed method achieves better recognition accuracy with less computation.
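
For intuition only, below is a minimal PyTorch sketch of the two mechanisms the abstract describes: class-token-guided pruning of the attention matrix with optional reuse of an adjacent layer's attention (ESA), and a parallel depth-wise convolution branch over the patch tokens (LE). All names (`ESABlock`, `keep_ratio`, `reuse_attn`) and design details (the top-k threshold, the residual placement) are assumptions for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class ESABlock(nn.Module):
    """Sketch of a Transformer encoder block with (a) pruning of attention
    weights from the class token to weakly associated patch tokens, with
    optional reuse of an adjacent layer's attention matrix, and (b) a
    parallel depth-wise convolution branch over the patch tokens."""

    def __init__(self, dim=192, heads=3, keep_ratio=0.5):
        super().__init__()
        self.heads = heads
        self.scale = (dim // heads) ** -0.5
        self.keep_ratio = keep_ratio  # fraction of patch tokens to keep
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)
        self.norm = nn.LayerNorm(dim)
        # Depth-wise conv over the 2-D patch-token grid (LE branch).
        self.dw_conv = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)

    def forward(self, x, grid_size, reuse_attn=None):
        # x: (B, 1 + N, C), index 0 is the class token, N = H * W patches.
        B, T, C = x.shape
        h = self.norm(x)
        q, k, v = self.qkv(h).chunk(3, dim=-1)
        q = q.view(B, T, self.heads, -1).transpose(1, 2)
        k = k.view(B, T, self.heads, -1).transpose(1, 2)
        v = v.view(B, T, self.heads, -1).transpose(1, 2)

        if reuse_attn is None:
            attn = (q @ k.transpose(-2, -1)) * self.scale  # (B, h, T, T)
            # Rank patch tokens by their attention to the class token and
            # mask out the weakly associated ones (keep top-k per head).
            cls_attn = attn[:, :, 0, 1:]                   # (B, h, N)
            k_keep = max(1, int(self.keep_ratio * (T - 1)))
            thresh = cls_attn.topk(k_keep, dim=-1).values[..., -1:]
            pruned = cls_attn.masked_fill(cls_attn < thresh, float("-inf"))
            attn[:, :, 0, 1:] = pruned
            attn = attn.softmax(dim=-1)
        else:
            # Reuse the attention matrix computed by the adjacent layer,
            # skipping the quadratic q @ k^T computation entirely.
            attn = reuse_attn

        out = self.proj((attn @ v).transpose(1, 2).reshape(B, T, C))

        # LE branch: depth-wise conv on the patch tokens only, in parallel
        # with self-attention, then merged back into the token sequence.
        H, W = grid_size
        patches = h[:, 1:].transpose(1, 2).reshape(B, C, H, W)
        local = self.dw_conv(patches).flatten(2).transpose(1, 2)  # (B, N, C)
        out[:, 1:] = out[:, 1:] + local

        return x + out, attn
```

Under this reading, a pair of adjacent blocks would share one attention computation, e.g. `x, attn = block_a(x, (14, 14))` followed by `x, _ = block_b(x, (14, 14), reuse_attn=attn)`, which is where the claimed savings in computation and inference time would come from.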
Keyword | Efficient Self-attention; Image Recognition; Locally Enhanced; ViT
DOI | 10.1007/978-981-99-1642-9_8 |
Language | English
Scopus ID | 2-s2.0-85161683342 |
Document Type | Conference paper |
Collection | Faculty of Science and Technology > Department of Computer and Information Science
Corresponding Author | Qi,Ke |
Affiliation | 1. Guangzhou University, Guangzhou, China; 2. University of Macau, Taipa, Macao; 3. South China Normal University, Guangzhou, China
Recommended Citation GB/T 7714 | Zhu, Zhefeng, Qi, Ke, Zhou, Yicong, et al. More Efficient and Locally Enhanced Transformer[C]. Springer Science and Business Media Deutschland GmbH, 2023: 86-97.
APA | Zhu, Zhefeng., Qi, Ke., Zhou, Yicong., Chen, Wenbin., & Zhang, Jingdong. (2023). More Efficient and Locally Enhanced Transformer. Communications in Computer and Information Science, 1792 CCIS, 86-97.
Files in This Item | There are no files associated with this item.