Status | Published |
GSLP-CIM: A 28-nm Globally Systolic and Locally Parallel CNN/Transformer Accelerator With Scalable and Reconfigurable eDRAM Compute-in-Memory Macro for Flexible Dataflow | |
Zhan, Yi1; Yu, Wei Han1 | |
2024-11 | |
Source Publication | IEEE Transactions on Circuits and Systems I-Regular Papers |
ISSN | 1549-8328 |
Abstract | This article reports a globally systolic and locally parallel (GSLP) convolutional neural network (CNN) and Transformer accelerator based on a scalable and reconfigurable (SR) embedded dynamic random-access memory (eDRAM) compute-in-memory (CIM) macro. It features: 1) a GSLP architecture that employs systolic CIM macros with a reconfigurable inter-CIM network to support flexible dataflows, including weight stationary (WS), output stationary (OS), and row stationary (RS); 2) an SR-CIM macro with a reconfigurable weight/input/output memory ratio to maximize the corresponding data reuse in each dataflow; 3) a high-density 3T eDRAM-CIM cell to further improve the density of the accelerator; and 4) an area-efficient in-memory accumulator (IMA) to reduce the area and power overhead of digital accumulation in each CIM macro. Prototyped in a 28-nm CMOS process, the proposed GSLP-CIM accelerator exhibits a 4b peak throughput density of 0.16 TOPS/mm² and a 4b peak compute energy efficiency of 3.55 TOPS/W. Evaluated with ResNet-50@ImageNet and ViT-B@ImageNet, this work reaches a system throughput of 24.5 and 5.66 inferences per second (IPS), a system throughput density of 19.3 IPS/mm² and 4.46 IPS/mm², and a system compute energy efficiency of 423.9 and 97.6 inferences per watt (IPW), respectively. |
Keyword | Neural Network (NN); Transformer; Embedded Dynamic Random-Access Memory (eDRAM); Compute-in-Memory (CIM); Systolic; Flexible Dataflow |
DOI | 10.1109/TCSI.2024.3497187 |
Indexed By | SCIE |
Language | English |
WOS Research Area | Engineering |
WOS Subject | Engineering, Electrical & Electronic |
WOS ID | WOS:001362259500001 |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 445 HOES LANE, PISCATAWAY, NJ 08855-4141 |
Scopus ID | 2-s2.0-85210276205 |
Document Type | Journal article |
Collection | Faculty of Science and Technology; The State Key Laboratory of Analog and Mixed-Signal VLSI (University of Macau); Institute of Microelectronics; Department of Electrical and Computer Engineering |
Corresponding Author | Yu, Wei Han |
Affiliation | 1. Faculty of Science and Technology, Department of Electrical and Computer Engineering, State Key Laboratory of Analog and Mixed-Signal VLSI, the Institute of Microelectronics, University of Macau, Macau, China; 2. Institute of Microelectronics, University of Macau, Macau, China; Instituto Superior Tecnico, Universidade de Lisboa, Lisbon, Portugal |
First Author Affiliation | Faculty of Science and Technology |
Corresponding Author Affiliation | Faculty of Science and Technology |
Recommended Citation GB/T 7714 | Zhan, Yi, Yu, Wei Han, Un, Ka Fai, et al. GSLP-CIM: A 28-nm Globally Systolic and Locally Parallel CNN/Transformer Accelerator With Scalable and Reconfigurable eDRAM Compute-in-Memory Macro for Flexible Dataflow[J]. IEEE Transactions on Circuits and Systems I-Regular Papers, 2024. |
APA | Zhan, Yi, Yu, Wei Han, Un, Ka Fai, Martins, Rui P., & Mak, Pui In (2024). GSLP-CIM: A 28-nm Globally Systolic and Locally Parallel CNN/Transformer Accelerator With Scalable and Reconfigurable eDRAM Compute-in-Memory Macro for Flexible Dataflow. IEEE Transactions on Circuits and Systems I-Regular Papers. |
MLA | Zhan, Yi, et al. "GSLP-CIM: A 28-nm Globally Systolic and Locally Parallel CNN/Transformer Accelerator With Scalable and Reconfigurable eDRAM Compute-in-Memory Macro for Flexible Dataflow." IEEE Transactions on Circuits and Systems I-Regular Papers (2024). |
Files in This Item: | There are no files associated with this item. |
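The system-level figures quoted in the abstract can be cross-checked with simple arithmetic. The sketch below is not from the paper: it assumes that throughput density is throughput divided by a single chip area, and that inferences per watt is throughput divided by average power. Under those assumptions both benchmarks should imply roughly the same area and power; the derived values (about 1.27 mm² and about 58 mW) are inferred here, not reported by the authors.

```python
# Figures quoted in the abstract (system level).
reported = {
    "ResNet-50@ImageNet": {"ips": 24.5, "ips_per_mm2": 19.3, "ipw": 423.9},
    "ViT-B@ImageNet": {"ips": 5.66, "ips_per_mm2": 4.46, "ipw": 97.6},
}


def implied_area_mm2(m):
    # Assumption: density = throughput / area, so area = throughput / density.
    return m["ips"] / m["ips_per_mm2"]


def implied_power_w(m):
    # Assumption: IPW = throughput / power, so power = throughput / IPW.
    return m["ips"] / m["ipw"]


# Derived (not reported) area and power for each benchmark.
derived = {
    name: {"area_mm2": implied_area_mm2(m), "power_w": implied_power_w(m)}
    for name, m in reported.items()
}

for name, d in derived.items():
    print(f"{name}: area ~ {d['area_mm2']:.2f} mm^2, "
          f"power ~ {d['power_w'] * 1e3:.1f} mW")
```

If the two benchmarks implied noticeably different areas or powers, that would suggest the metrics are normalized differently than assumed here.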