Residential College: false
Status: Published
P3 ViT: A CIM-Based High-Utilization Architecture With Dynamic Pruning and Two-Way Ping-Pong Macro for Vision Transformer
Fu, Xiangqu1,2; Ren, Qirui1,2; Wu, Hao1,2; Xiang, Feibin1,2; Luo, Qing1,2; Yue, Jinshan1,2; Chen, Yong3; Zhang, Feng1,2
2023-09
Source Publication: IEEE Transactions on Circuits and Systems I: Regular Papers
ISSN: 1549-8328
Volume: 70, Issue: 12, Pages: 4938-4948
Abstract

Transformers have made remarkable contributions to natural language processing (NLP) and many other fields. Recently, transformer-based models have surpassed traditional convolutional neural networks (CNNs) and achieved state-of-the-art (SOTA) performance on computer vision tasks. Unfortunately, existing CNN accelerators cannot efficiently support transformers due to the high computational overhead and redundant data accesses associated with the 'KQV' matrix operations in transformer models. If the recently developed NLP transformer accelerators are applied to vision transformer (ViT) models, their efficiency decreases due to three challenges. 1) Redundant data storage and access still exist in ViT data flow scheduling. 2) For matrix transposition in transformer models, previous transpose-operation schemes lack flexibility, resulting in extra area overhead. 3) The sparse acceleration schemes designed for NLP in prior transformer accelerators cannot efficiently accelerate ViT, which has relatively fewer tokens. To overcome these challenges, we propose P3ViT, a computing-in-memory (CIM)-based architecture, to efficiently accelerate ViT with high utilization in data flow scheduling. There are three key contributions: 1) The P3ViT architecture supports three ping-pong pipeline scheduling modes, namely inter-core parallel and intra-core ping-pong pipeline mode (IEP-IAP3), inter-core pipeline and parallel mode (IEP2), and full parallel mode, to eliminate redundant memory accesses. 2) A two-way ping-pong CIM macro is proposed, which can be configured to a regular calculation mode and a transpose calculation mode to adapt to both Q×K and A×V tasks. 3) P3ViT also runs a small prediction network that hierarchically and dynamically prunes redundant tokens down to a standard number, enabling high-throughput and high-utilization attention computation. Measurements show that P3ViT achieves 1.13× higher energy efficiency than the state-of-the-art transformer accelerator and achieves 30.8× and 14.6× speedup compared with CPU and GPU.
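To make the third contribution concrete, the snippet below is a minimal software sketch (not the authors' implementation) of dynamic token pruning driven by a small prediction network: a tiny scorer assigns an importance score to each token, and only a fixed "standard number" of tokens is kept before attention, hierarchically across stages. The scorer architecture, the layer schedule, and the kept-token counts are illustrative assumptions.

```python
# Hedged sketch of prediction-network-based hierarchical token pruning.
# All module sizes and keep counts are assumptions for illustration only.
import torch
import torch.nn as nn


class TokenScorer(nn.Module):
    """Tiny prediction network that assigns an importance score per token."""

    def __init__(self, dim: int, hidden: int = 32):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, num_tokens, dim) -> scores: (batch, num_tokens)
        return self.mlp(tokens).squeeze(-1)


def prune_to_standard(tokens: torch.Tensor, scorer: TokenScorer, keep: int) -> torch.Tensor:
    """Keep the `keep` highest-scoring tokens (class token always retained)."""
    cls_tok, patch_tok = tokens[:, :1], tokens[:, 1:]
    scores = scorer(patch_tok)                                 # (B, N-1)
    idx = scores.topk(keep - 1, dim=1).indices                 # keep-1 patch tokens
    idx = idx.unsqueeze(-1).expand(-1, -1, tokens.size(-1))
    kept = torch.gather(patch_tok, 1, idx)
    return torch.cat([cls_tok, kept], dim=1)                   # (B, keep, dim)


if __name__ == "__main__":
    B, N, D = 1, 197, 192             # e.g. DeiT-Tiny: 196 patch tokens + class token
    x = torch.randn(B, N, D)
    scorer = TokenScorer(D)
    # Hypothetical hierarchical schedule: prune after successive stages.
    for keep in (128, 64, 32):        # "standard numbers" chosen for illustration
        x = prune_to_standard(x, scorer, keep)
        print(x.shape)
```

Pruning to a fixed, predetermined token count (rather than an input-dependent one) is what keeps the attention workload regular, so a CIM array can stay highly utilized regardless of which tokens were dropped.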

Keywords: Accelerator; CMOS; Computing-in-memory (CIM); Dynamic pruning; Prediction network; Vision Transformer (ViT)
DOI: 10.1109/TCSI.2023.3315060
Indexed By: SCIE
Language: English
WOS Research Area: Engineering
WOS Subject: Engineering, Electrical & Electronic
WOS ID: WOS:001078405200001
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 445 HOES LANE, PISCATAWAY, NJ 08855-4141
Scopus ID: 2-s2.0-85173318525
Document Type: Journal article
Collection: Faculty of Science and Technology
THE STATE KEY LABORATORY OF ANALOG AND MIXED-SIGNAL VLSI (UNIVERSITY OF MACAU)
INSTITUTE OF MICROELECTRONICS
DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING
Corresponding Author: Yue, Jinshan; Zhang, Feng
Affiliation:
1. Institute of Microelectronics, Chinese Academy of Sciences (CAS), Laboratory of Microelectronics Device and Integrated Technology, Beijing, 100029, China
2. University of Chinese Academy of Sciences, School of Integrated Circuits, Beijing, 100049, China
3. University of Macau, State Key Laboratory of Analog and Mixed-Signal VLSI, Faculty of Science and Technology, Department of ECE, Macau, Macao
Recommended Citation
GB/T 7714
Fu, Xiangqu, Ren, Qirui, Wu, Hao, et al. P3 ViT: A CIM-Based High-Utilization Architecture With Dynamic Pruning and Two-Way Ping-Pong Macro for Vision Transformer[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2023, 70(12): 4938-4948.
APA Fu, Xiangqu., Ren, Qirui., Wu, Hao., Xiang, Feibin., Luo, Qing., Yue, Jinshan., Chen, Yong., & Zhang, Feng (2023). P3 ViT: A CIM-Based High-Utilization Architecture With Dynamic Pruning and Two-Way Ping-Pong Macro for Vision Transformer. IEEE Transactions on Circuits and Systems I: Regular Papers, 70(12), 4938-4948.
MLA Fu, Xiangqu, et al. "P3 ViT: A CIM-Based High-Utilization Architecture With Dynamic Pruning and Two-Way Ping-Pong Macro for Vision Transformer." IEEE Transactions on Circuits and Systems I: Regular Papers 70.12 (2023): 4938-4948.
Files in This Item:
There are no files associated with this item.
Related Services
Google Scholar
Similar articles in Google Scholar
[Fu, Xiangqu]'s Articles
[Ren, Qirui]'s Articles
[Wu, Hao]'s Articles
Baidu academic
Similar articles in Baidu academic
[Fu, Xiangqu]'s Articles
[Ren, Qirui]'s Articles
[Wu, Hao]'s Articles
Bing Scholar
Similar articles in Bing Scholar
[Fu, Xiangqu]'s Articles
[Ren, Qirui]'s Articles
[Wu, Hao]'s Articles

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.