A 97.8 GOPS/W FPGA-Based Residual-Block-Aware CNN Accelerator Featuring Multi-Clock PW2 Pipeline and Adaptive-Resolution Quantization

doi:10.1109/TCSI.2024.3505299


Status	Published
Author	Li, Jixuan 1; Li, Ke 1; Un, Ka Fai 1; Yu, Wei Han 1; Martins, Rui P. 1,2; Mak, Pui In 1
Date	2024-11
Source Publication	IEEE Transactions on Circuits and Systems I: Regular Papers
ISSN	1549-8328
Abstract	Enhancing the energy efficiency of the residual block is crucial for an energy-efficient deep neural network accelerator. This paper presents a multi-clock pointwise-pointwise (MCPW) technique to process the adjacent PW convolution layers across residual blocks, reducing DRAM access for the intermediate feature maps by up to 75.0% while securing >88.1% processing element (PE) utilization. Moreover, we introduce a dual-precision packing (DPP) DSP array to compute multiple 4/8-bit multiplications in a shared DSP, improving the accuracy by 1.5% (ImageNet) using low-precision residual distillation (RD) with adaptive-resolution quantization. The DPP DSP and adaptive-resolution RD boost the DSP efficiency up to 4.0×, reduce DRAM access by 50.0%, and improve the throughput by >2.7×. We also propose a dynamic accumulator/multiplier (A/M) DSP reconfiguration scheme to dynamically adjust the level of parallelism along the input/output channel dimensions. It also increases the PE utilization by 1.8× for the depthwise (DW) convolution layers with 33% less hardware resource overhead. Implemented on Xilinx VC709, the proposed accelerator achieves PE utilization of >93.0%, a DSP efficiency gain of >2.9×, and a throughput improvement on benchmarked networks of 4.9× while exhibiting an energy efficiency of 97.8 GOPS/W and a normalized throughput of 1.18 GOPS/DSP.
Keyword	Convolutional Neural Network (CNN); Digital Signal Processing (DSP); Field-Programmable Gate Array (FPGA); Processing Unit (PE) Utilization; Residual Block
DOI	10.1109/TCSI.2024.3505299
Indexed By	SCIE
Language	English
WOS Research Area	Engineering
WOS Subject	Engineering, Electrical & Electronic
WOS ID	WOS:001367632400001
Publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 445 HOES LANE, PISCATAWAY, NJ 08855-4141
Scopus ID	2-s2.0-85210960901
Document Type	Journal article
Collection	INSTITUTE OF MICROELECTRONICS; DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING
Corresponding Author	Un, Ka Fai
Affiliation	1. University of Macau, State-Key Laboratory of Analog and Mixed-Signal VLSI, Institute of Microelectronics and the Faculty of Science and Technology, ECE Department, Macau, Macao; 2. Universidade de Lisboa, Instituto Superior Técnico, Lisbon, 1049-001, Portugal
First Author Affiliation	Faculty of Science and Technology
Corresponding Author Affiliation	Faculty of Science and Technology
Recommended Citation GB/T 7714	Li, Jixuan, Li, Ke, Un, Ka Fai, et al. A 97.8 GOPS/W FPGA-Based Residual-Block-Aware CNN Accelerator Featuring Multi-Clock PW2 Pipeline and Adaptive-Resolution Quantization[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2024.
APA	Li, Jixuan, Li, Ke, Un, Ka Fai, Yu, Wei Han, Martins, Rui P., & Mak, Pui In (2024). A 97.8 GOPS/W FPGA-Based Residual-Block-Aware CNN Accelerator Featuring Multi-Clock PW2 Pipeline and Adaptive-Resolution Quantization. IEEE Transactions on Circuits and Systems I: Regular Papers.
MLA	Li, Jixuan, et al. "A 97.8 GOPS/W FPGA-Based Residual-Block-Aware CNN Accelerator Featuring Multi-Clock PW2 Pipeline and Adaptive-Resolution Quantization." IEEE Transactions on Circuits and Systems I: Regular Papers (2024).
