Residential College | false |
Status | Published |
P3 ViT: A CIM-Based High-Utilization Architecture With Dynamic Pruning and Two-Way Ping-Pong Macro for Vision Transformer | |
Fu, Xiangqu1,2; Ren, Qirui1,2; Wu, Hao1,2; Xiang, Feibin1,2; Luo, Qing1,2; Yue, Jinshan1,2; Chen, Yong3; Zhang, Feng1,2 | |
2023-09 | |
Source Publication | IEEE Transactions on Circuits and Systems I: Regular Papers |
ISSN | 1549-8328 |
Volume | 70 |
Issue | 12 |
Pages | 4938-4948 |
Abstract | Transformers have made remarkable contributions to natural language processing (NLP) and many other fields. Recently, transformer-based models have achieved state-of-the-art (SOTA) performance on computer vision tasks compared with traditional convolutional neural networks (CNNs). Unfortunately, existing CNN accelerators cannot efficiently support transformers due to the high computational overhead and redundant data accesses associated with the 'KQV' matrix operations in transformer models. If the recently developed NLP transformer accelerators are applied to vision transformer (ViT) models, their efficiency decreases due to three challenges. 1) Redundant data storage and access still exist in ViT data-flow scheduling. 2) For matrix transposition in transformer models, previous transpose-operation schemes lack flexibility, resulting in extra area overhead. 3) The sparse acceleration schemes for NLP in prior transformer accelerators cannot efficiently accelerate ViT, which has relatively few tokens. To overcome these challenges, we propose P3ViT, a computing-in-memory (CIM)-based architecture that efficiently accelerates ViT and achieves high utilization in data-flow scheduling. There are three key contributions: 1) The P3ViT architecture supports three ping-pong pipeline scheduling modes, namely an inter-core parallel and intra-core ping-pong pipeline mode (IEP-IAP3), an inter-core pipeline and parallel mode (IEP2), and a full parallel mode, to eliminate redundant memory accesses. 2) A two-way ping-pong CIM macro is proposed, which can be configured in a regular calculation mode or a transpose calculation mode to support both Q×K and A×V tasks. 3) P3ViT also runs a small prediction network that hierarchically and dynamically prunes redundant tokens down to a standard number, enabling high-throughput, high-utilization attention computation.
Measurements show that P3ViT achieves 1.13× higher energy efficiency than the state-of-the-art transformer accelerator, and 30.8× and 14.6× speedups over CPU and GPU, respectively. |
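The hierarchical, dynamic token pruning described in the abstract (a small prediction network scores tokens, and low-scoring ones are dropped stage by stage until a fixed "standard" count remains) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the L2-norm scorer and the 197 → 128 → 64 schedule are assumed stand-ins for the learned prediction network and the actual pruning schedule.

```python
import numpy as np

def prune_tokens(tokens, scores, keep):
    """Keep the `keep` highest-scoring tokens, preserving their order.

    tokens: (N, D) array of token embeddings
    scores: (N,) importance scores from a small prediction network
    keep:   target ("standard") token count after this stage
    """
    keep_idx = np.sort(np.argsort(scores)[-keep:])  # top-k indices, original order
    return tokens[keep_idx]

def hierarchical_prune(tokens, score_fn, schedule):
    """Prune stage by stage, e.g. 197 -> 128 -> 64 tokens."""
    for keep in schedule:
        scores = score_fn(tokens)
        tokens = prune_tokens(tokens, scores, keep)
    return tokens

# Toy scorer: rank each token by its L2 norm (a hypothetical stand-in
# for the paper's learned prediction network).
rng = np.random.default_rng(0)
x = rng.standard_normal((197, 64))  # ViT-style input: 197 tokens, dim 64
pruned = hierarchical_prune(x, lambda t: np.linalg.norm(t, axis=1), [128, 64])
print(pruned.shape)  # (64, 64)
```

Pruning to a fixed token count at each stage is what lets the attention hardware stay fully utilized: every layer sees a standard-sized workload regardless of the input image.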
Keyword | Accelerator; CMOS; Computing-in-memory (CIM); Dynamic pruning; Prediction network; Vision transformer (ViT) |
DOI | 10.1109/TCSI.2023.3315060 |
Indexed By | SCIE |
Language | English |
WOS Research Area | Engineering |
WOS Subject | Engineering, Electrical & Electronic |
WOS ID | WOS:001078405200001 |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 445 HOES LANE, PISCATAWAY, NJ 08855-4141 |
Scopus ID | 2-s2.0-85173318525 |
Document Type | Journal article |
Collection | Faculty of Science and Technology; The State Key Laboratory of Analog and Mixed-Signal VLSI (University of Macau); Institute of Microelectronics; Department of Electrical and Computer Engineering |
Corresponding Author | Yue, Jinshan; Zhang, Feng |
Affiliation | 1. Institute of Microelectronics, Chinese Academy of Sciences (CAS), Laboratory of Microelectronics Device and Integrated Technology, Beijing, 100029, China 2. University of Chinese Academy of Sciences, School of Integrated Circuits, Beijing, 100049, China 3. University of Macau, State Key Laboratory of Analog and Mixed-Signal VLSI, Faculty of Science and Technology, Department of ECE, Macau, Macao |
Recommended Citation GB/T 7714 | Fu, Xiangqu,Ren, Qirui,Wu, Hao,et al. P3 ViT: A CIM-Based High-Utilization Architecture With Dynamic Pruning and Two-Way Ping-Pong Macro for Vision Transformer[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2023, 70(12), 4938-4948. |
APA | Fu, Xiangqu., Ren, Qirui., Wu, Hao., Xiang, Feibin., Luo, Qing., Yue, Jinshan., Chen, Yong., & Zhang, Feng (2023). P3 ViT: A CIM-Based High-Utilization Architecture With Dynamic Pruning and Two-Way Ping-Pong Macro for Vision Transformer. IEEE Transactions on Circuits and Systems I: Regular Papers, 70(12), 4938-4948. |
MLA | Fu, Xiangqu,et al."P3 ViT: A CIM-Based High-Utilization Architecture With Dynamic Pruning and Two-Way Ping-Pong Macro for Vision Transformer".IEEE Transactions on Circuits and Systems I: Regular Papers 70.12(2023):4938-4948. |
Files in This Item: | There are no files associated with this item. |