GSLP-CIM: A 28-nm Globally Systolic and Locally Parallel CNN/Transformer Accelerator With Scalable and Reconfigurable eDRAM Compute-in-Memory Macro for Flexible Dataflow

doi:10.1109/TCSI.2024.3497187


Status	Published
Title	GSLP-CIM: A 28-nm Globally Systolic and Locally Parallel CNN/Transformer Accelerator With Scalable and Reconfigurable eDRAM Compute-in-Memory Macro for Flexible Dataflow
Author	Zhan, Yi (1); Yu, Wei Han (1); Un, Ka Fai (1); Martins, Rui P. (1, 2); Mak, Pui In (1)
Issue Date	2024-11
Source Publication	IEEE Transactions on Circuits and Systems I-Regular Papers
ISSN	1549-8328
Abstract	This article reports a globally systolic and locally parallel (GSLP) convolutional neural network (CNN) and Transformer accelerator based on a scalable and reconfigurable (SR) embedded dynamic random-access memory (eDRAM) compute-in-memory (CIM) macro. It features: 1) a GSLP architecture that employs systolic CIM macros with a reconfigurable inter-CIM network to support flexible dataflow, including weight stationary (WS), output stationary (OS), and row stationary (RS); 2) an SR-CIM macro with a reconfigurable weight/input/output memory ratio to maximize data reuse under each dataflow; 3) a high-density 3T eDRAM-CIM cell to further improve the density of the accelerator; and 4) an area-efficient in-memory accumulator (IMA) to reduce the area and power overhead of digital accumulation in each CIM macro. Prototyped in a 28-nm CMOS process, the proposed GSLP-CIM accelerator exhibits a 4b peak throughput density of 0.16 TOPS/mm² and a 4b peak compute energy efficiency of 3.55 TOPS/W. Evaluated with ResNet-50@ImageNet and ViT-B@ImageNet, this work reaches a system throughput of 24.5 and 5.66 inferences per second (IPS), a system throughput density of 19.3 and 4.46 IPS/mm², and a system compute energy efficiency of 423.9 and 97.6 inferences per watt (IPW), respectively.
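The system-level figures in the abstract can be cross-checked with simple arithmetic. The sketch below is not from the paper: it assumes that throughput density is throughput divided by a single normalizing area, and that inferences per watt is throughput divided by average power. Under those assumptions it derives the implied area and power for each benchmark. The function and variable names are illustrative only, and the derived values are not figures reported by the authors.

```python
# Figures quoted in the abstract, per benchmark:
# (throughput in IPS, throughput density in IPS/mm^2, efficiency in IPW).
reported = {
    "ResNet-50@ImageNet": (24.5, 19.3, 423.9),
    "ViT-B@ImageNet": (5.66, 4.46, 97.6),
}


def implied_area_mm2(ips: float, ips_per_mm2: float) -> float:
    """Area implied by the ratio IPS / (IPS/mm^2). Assumption, not a reported value."""
    return ips / ips_per_mm2


def implied_power_w(ips: float, ipw: float) -> float:
    """Power implied by the ratio IPS / (inferences per watt). Assumption, not a reported value."""
    return ips / ipw


# Derive (area in mm^2, power in W) for each benchmark from the quoted numbers.
derived = {
    name: (implied_area_mm2(ips, density), implied_power_w(ips, ipw))
    for name, (ips, density, ipw) in reported.items()
}

for name, (area, power) in derived.items():
    print(f"{name}: implied area ~{area:.2f} mm^2, implied power ~{power * 1e3:.1f} mW")
```

Both benchmarks yield an implied area of about 1.27 mm² and an implied power of about 58 mW, which suggests the two sets of figures are normalized consistently. What that area covers (core or full system) cannot be determined from the abstract alone.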
Keyword	Neural Network (NN); Transformer; Embedded Dynamic Random-Access Memory (eDRAM); Compute-in-Memory (CIM); Systolic; Flexible Dataflow
DOI	10.1109/TCSI.2024.3497187
Indexed By	SCIE
Language	English
WOS Research Area	Engineering
WOS Subject	Engineering, Electrical & Electronic
WOS ID	WOS:001362259500001
Publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 445 HOES LANE, PISCATAWAY, NJ 08855-4141
Scopus ID	2-s2.0-85210276205
Document Type	Journal article
Collection	Faculty of Science and Technology; The State Key Laboratory of Analog and Mixed-Signal VLSI (University of Macau); Institute of Microelectronics; Department of Electrical and Computer Engineering
Corresponding Author	Yu, Wei Han
Affiliation	1. Faculty of Science and Technology, Department of Electrical and Computer Engineering, State Key Laboratory of Analog and Mixed-Signal VLSI, the Institute of Microelectronics, University of Macau, Macau, China; 2. Institute of Microelectronics, University of Macau, Macau, China; Instituto Superior Tecnico, Universidade de Lisboa, Lisbon, Portugal
First Author Affiliation	Faculty of Science and Technology
Corresponding Author Affiliation	Faculty of Science and Technology
Recommended Citation GB/T 7714	Zhan, Yi, Yu, Wei Han, Un, Ka Fai, et al. GSLP-CIM: A 28-nm Globally Systolic and Locally Parallel CNN/Transformer Accelerator With Scalable and Reconfigurable eDRAM Compute-in-Memory Macro for Flexible Dataflow[J]. IEEE Transactions on Circuits and Systems I-Regular Papers, 2024.
APA	Zhan, Y., Yu, W. H., Un, K. F., Martins, R. P., & Mak, P. I. (2024). GSLP-CIM: A 28-nm Globally Systolic and Locally Parallel CNN/Transformer Accelerator With Scalable and Reconfigurable eDRAM Compute-in-Memory Macro for Flexible Dataflow. IEEE Transactions on Circuits and Systems I-Regular Papers.
MLA	Zhan, Yi, et al. "GSLP-CIM: A 28-nm Globally Systolic and Locally Parallel CNN/Transformer Accelerator With Scalable and Reconfigurable eDRAM Compute-in-Memory Macro for Flexible Dataflow." IEEE Transactions on Circuits and Systems I-Regular Papers (2024).
