Status: Published
A 97.8 GOPS/W FPGA-Based Residual-Block-Aware CNN Accelerator Featuring Multi-Clock PW2 Pipeline and Adaptive-Resolution Quantization
Li, Jixuan (1); Li, Ke (1); Un, Ka Fai (1); Yu, Wei Han (1); Martins, Rui P. (1, 2); Mak, Pui In (1)
2024-11
Source Publication: IEEE Transactions on Circuits and Systems I: Regular Papers
ISSN: 1549-8328
Abstract

Improving the energy efficiency of residual blocks is crucial for an energy-efficient deep neural network accelerator. This paper presents a multi-clock pointwise-pointwise (MCPW) technique that processes the adjacent pointwise (PW) convolution layers across residual blocks, reducing DRAM access for the intermediate feature maps by up to 75.0% while securing >88.1% processing element (PE) utilization. Moreover, we introduce a dual-precision packing (DPP) DSP array that computes multiple 4/8-bit multiplications in a shared DSP, improving the accuracy by 1.5% (ImageNet) using low-precision residual distillation (RD) with adaptive-resolution quantization. The DPP DSP and adaptive-resolution RD boost the DSP efficiency by up to 4.0×, reduce DRAM access by 50.0%, and improve the throughput by >2.7×. We also propose a dynamic accumulator/multiplier (A/M) DSP reconfiguration scheme that dynamically adjusts the level of parallelism along the input/output channel dimensions. This scheme also increases the PE utilization by 1.8× for the depthwise (DW) convolution layers with 33% less hardware resource overhead. Implemented on a Xilinx VC709, the proposed accelerator achieves PE utilization of >93.0%, a DSP efficiency gain of >2.9×, and a throughput improvement on benchmarked networks of 4.9× while exhibiting an energy efficiency of 97.8 GOPS/W and a normalized throughput of 1.18 GOPS/DSP.
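
The abstract does not spell out how the DPP DSP array lays out its operands, so the following is only a minimal sketch of the general idea of sharing one wide multiplier among several narrow multiplications. It assumes unsigned operands and a shared activation, and it ignores the actual multiplier width of any particular DSP slice. The function name `packed_multiply` and its parameters are hypothetical and do not come from the paper.

```python
def packed_multiply(weights, activation, bits):
    """Multiply several unsigned `bits`-wide weights by one shared activation
    with a single wide multiplication, then unpack the individual products.

    Illustrative only: unsigned operands, no model of real DSP slice widths.
    """
    limit = 1 << bits
    if not 0 <= activation < limit:
        raise ValueError("activation does not fit in the given bit width")

    # A product of two unsigned `bits`-wide values is below 2**(2*bits),
    # so a field of 2*bits per weight keeps neighbouring products apart.
    field = 2 * bits

    packed = 0
    for i, w in enumerate(weights):
        if not 0 <= w < limit:
            raise ValueError("weight does not fit in the given bit width")
        packed |= w << (i * field)

    # One wide multiplication produces all partial products at once.
    wide = packed * activation

    mask = (1 << field) - 1
    return [(wide >> (i * field)) & mask for i in range(len(weights))]


if __name__ == "__main__":
    # Two 8-bit products from one multiplication.
    print(packed_multiply([200, 17], 255, bits=8))
    # Four 4-bit products from one multiplication.
    print(packed_multiply([15, 3, 9, 0], 15, bits=4))
```

Halving the operand width doubles the number of products that fit in the same packed word, which is the intuition behind trading precision for DSP efficiency. Signed operands would need an extra correction step that this sketch leaves out.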

Keywords: Convolutional Neural Network (CNN); Digital Signal Processing (DSP); Field-Programmable Gate Array (FPGA); Processing Unit (PE) Utilization; Residual Block
DOI: 10.1109/TCSI.2024.3505299
Indexed By: SCIE
Language: English
WOS Research Area: Engineering
WOS Subject: Engineering, Electrical & Electronic
WOS ID: WOS:001367632400001
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 445 HOES LANE, PISCATAWAY, NJ 08855-4141
Scopus ID: 2-s2.0-85210960901
Document Type: Journal article
Collection: INSTITUTE OF MICROELECTRONICS; DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING
Corresponding Author: Un, Ka Fai
Affiliation:
1. University of Macau, State-Key Laboratory of Analog and Mixed-Signal VLSI, Institute of Microelectronics and the Faculty of Science and Technology, ECE Department, Macau, Macao
2. Universidade de Lisboa, Instituto Superior Técnico, Lisbon, 1049-001, Portugal
First Author Affiliation: Faculty of Science and Technology
Corresponding Author Affiliation: Faculty of Science and Technology
Recommended Citation
GB/T 7714
Li, Jixuan, Li, Ke, Un, Ka Fai, et al. A 97.8 GOPS/W FPGA-Based Residual-Block-Aware CNN Accelerator Featuring Multi-Clock PW2 Pipeline and Adaptive-Resolution Quantization[J]. IEEE Transactions on Circuits and Systems I: Regular Papers, 2024.
APA: Li, Jixuan, Li, Ke, Un, Ka Fai, Yu, Wei Han, Martins, Rui P., & Mak, Pui In (2024). A 97.8 GOPS/W FPGA-Based Residual-Block-Aware CNN Accelerator Featuring Multi-Clock PW2 Pipeline and Adaptive-Resolution Quantization. IEEE Transactions on Circuits and Systems I: Regular Papers.
MLA: Li, Jixuan, et al. "A 97.8 GOPS/W FPGA-Based Residual-Block-Aware CNN Accelerator Featuring Multi-Clock PW2 Pipeline and Adaptive-Resolution Quantization." IEEE Transactions on Circuits and Systems I: Regular Papers (2024).
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.