Residential College | false |
Status | Published |
P3 ViT: A CIM-Based High-Utilization Architecture With Dynamic Pruning and Two-Way Ping-Pong Macro for Vision Transformer | |
Fu, Xiangqu1,2; Ren, Qirui1,2; Wu, Hao1,2; Xiang, Feibin1,2; Luo, Qing1,2; Yue, Jinshan1,2; Chen, Yong3; Zhang, Feng1,2 | |
2023-09 | |
Source Publication | IEEE Transactions on Circuits and Systems I: Regular Papers |
ISSN | 1549-8328 |
Volume | 70 |
Issue | 12 |
Pages | 4938-4948 |
Abstract | Transformers have made remarkable contributions to natural language processing (NLP) and many other fields. Recently, transformer-based models have achieved state-of-the-art (SOTA) performance on computer vision tasks compared with traditional convolutional neural networks (CNNs). Unfortunately, existing CNN accelerators cannot efficiently support transformers due to the high computational overhead and redundant data accesses associated with the 'KQV' matrix operations in transformer models. If the recently developed NLP transformer accelerators are applied to vision transformer (ViT) models, their efficiency decreases due to three challenges. 1) Redundant data storage and access still exist in ViT data-flow scheduling. 2) For matrix transposition in transformer models, previous transpose-operation schemes lack flexibility, resulting in extra area overhead. 3) The sparse acceleration schemes for NLP in prior transformer accelerators cannot efficiently accelerate ViT, which has relatively few tokens. To overcome these challenges, we propose P3ViT, a computing-in-memory (CIM)-based architecture that efficiently accelerates ViT and achieves high utilization in data-flow scheduling. There are three key contributions: 1) The P3ViT architecture supports three ping-pong pipeline scheduling modes, namely an inter-core parallel and intra-core ping-pong pipeline mode (IEP-IAP3), an inter-core pipeline and parallel mode (IEP2), and a full parallel mode, to eliminate redundant memory accesses. 2) A two-way ping-pong CIM macro is proposed, which can be configured in a regular calculation mode or a transpose calculation mode to support both Q×K and A×V tasks. 3) P3ViT also runs a small prediction network that hierarchically and dynamically prunes redundant tokens down to a standard number, enabling high-throughput, high-utilization attention computation.
Measurements show that P3ViT achieves 1.13× higher energy efficiency than the state-of-the-art transformer accelerator, and 30.8× and 14.6× speedups over CPU and GPU, respectively. |
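The hierarchical, dynamic token pruning described in the abstract (a small prediction network scores tokens, and low-scoring ones are dropped stage by stage until a fixed "standard" count remains) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the L2-norm scorer and the 197 → 128 → 64 schedule are assumed stand-ins for the learned prediction network and the actual pruning schedule.

```python
import numpy as np

def prune_tokens(tokens, scores, keep):
    """Keep the `keep` highest-scoring tokens, preserving their order.

    tokens: (N, D) array of token embeddings
    scores: (N,) importance scores from a small prediction network
    keep:   target ("standard") token count after this stage
    """
    keep_idx = np.sort(np.argsort(scores)[-keep:])  # top-k indices, original order
    return tokens[keep_idx]

def hierarchical_prune(tokens, score_fn, schedule):
    """Prune stage by stage, e.g. 197 -> 128 -> 64 tokens."""
    for keep in schedule:
        scores = score_fn(tokens)
        tokens = prune_tokens(tokens, scores, keep)
    return tokens

# Toy scorer: rank each token by its L2 norm (a hypothetical stand-in
# for the paper's learned prediction network).
rng = np.random.default_rng(0)
x = rng.standard_normal((197, 64))  # ViT-style input: 197 tokens, dim 64
pruned = hierarchical_prune(x, lambda t: np.linalg.norm(t, axis=1), [128, 64])
print(pruned.shape)  # (64, 64)
```

Pruning to a fixed token count at each stage is what lets the attention hardware stay fully utilized: every layer sees a standard-sized workload regardless of the input image.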
Keyword | Accelerator; CMOS; Computing-in-memory (CIM); Dynamic pruning; Prediction network; Vision transformer (ViT) |
DOI | 10.1109/TCSI.2023.3315060 |
Indexed By | SCIE |
Language | English |
WOS Research Area | Engineering |
WOS Subject | Engineering, Electrical & Electronic |
WOS ID | WOS:001078405200001 |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 445 HOES LANE, PISCATAWAY, NJ 08855-4141 |
Scopus ID | 2-s2.0-85173318525 |
Document Type | Journal article |
Collection | Faculty of Science and Technology; The State Key Laboratory of Analog and Mixed-Signal VLSI (University of Macau); Institute of Microelectronics; Department of Electrical and Computer Engineering |
Corresponding Author | Yue, Jinshan; Zhang, Feng |
Affiliation | 1. Institute of Microelectronics, Chinese Academy of Sciences (CAS), Laboratory of Microelectronics Device and Integrated Technology, Beijing, 100029, China 2. University of Chinese Academy of Sciences, School of Integrated Circuits, Beijing, 100049, China 3. University of Macau, State Key Laboratory of Analog and Mixed-Signal VLSI, Faculty of Science and Technology, Department of ECE, Macau, Macao |
Recommended Citation GB/T 7714 | Fu, Xiangqu,Ren, Qirui,Wu, Hao,et al. P3 ViT: A CIM-Based High-Utilization Architecture With Dynamic Pruning and Two-Way Ping-Pong Macro for Vision Transformer[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2023, 70(12), 4938-4948. |
APA | Fu, Xiangqu., Ren, Qirui., Wu, Hao., Xiang, Feibin., Luo, Qing., Yue, Jinshan., Chen, Yong., & Zhang, Feng (2023). P3 ViT: A CIM-Based High-Utilization Architecture With Dynamic Pruning and Two-Way Ping-Pong Macro for Vision Transformer. IEEE Transactions on Circuits and Systems I: Regular Papers, 70(12), 4938-4948. |
MLA | Fu, Xiangqu,et al."P3 ViT: A CIM-Based High-Utilization Architecture With Dynamic Pruning and Two-Way Ping-Pong Macro for Vision Transformer".IEEE Transactions on Circuits and Systems I: Regular Papers 70.12(2023):4938-4948. |
Files in This Item: | There are no files associated with this item. |