Status: Published
An FPGA-Based Transformer Accelerator With Parallel Unstructured Sparsity Handling for Question-Answering Applications
Cao, Rujian1; Zhao, Zhongyu1; Un, Ka Fai1; Yu, Wei Han1; Martins, Rui P.1,2; Mak, Pui In1
2024-11
Source Publication: IEEE Transactions on Circuits and Systems II: Express Briefs
ISSN: 1549-7747
Volume: 71, Issue: 11, Pages: 4688-4692
Abstract

Dataflow management yields limited performance improvement for the transformer model because it reuses weights less than a convolutional neural network does. The cosFormer reduces computational complexity while achieving performance comparable to the vanilla transformer on natural language processing tasks. However, the unstructured sparsity in the cosFormer makes efficient implementation challenging. This brief proposes a parallel unstructured sparsity handling (PUSH) scheme to compute sparse-dense matrix multiplication (SDMM) efficiently. It transforms unstructured sparsity into structured sparsity and reduces total memory access by balancing the memory accesses of the sparse and dense matrices in the SDMM. We also employ unstructured weight pruning in concert with PUSH to further increase the structured sparsity of the model. Verified on an FPGA platform, the proposed accelerator achieves a throughput of 2.82 TOPS and an energy efficiency of 144.8 GOPS/W on the HotpotQA dataset with long sequences.
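The PUSH scheme itself is an FPGA hardware dataflow and cannot be reproduced here, but the core idea the abstract describes can be sketched in software: compact each row's unstructured nonzeros into a packed (structured) layout, then gather only the rows of the dense matrix those nonzeros actually touch. The function name `sdmm_structured` and the NumPy formulation below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def sdmm_structured(sparse_a, dense_b):
    """Sparse-dense matrix multiply (SDMM).

    For each row of the sparse matrix, pack its scattered
    (unstructured) nonzeros into a contiguous (structured) list of
    indices and values, so only the corresponding rows of the dense
    matrix need to be fetched.
    """
    m, _ = sparse_a.shape
    out = np.zeros((m, dense_b.shape[1]))
    for i in range(m):
        cols = np.nonzero(sparse_a[i])[0]   # packed column indices
        if cols.size == 0:
            continue                        # all-zero row: no memory access
        vals = sparse_a[i, cols]            # packed nonzero values
        out[i] = vals @ dense_b[cols]       # touch only the needed B rows
    return out
```

In hardware, the per-row gather above maps to balancing memory accesses between the sparse and dense operands; this sketch only shows the arithmetic equivalence, and it matches a dense matmul on the same inputs.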

Keywords: Sparse Matrices, Computational Modeling, Transformers, Hardware, Energy Efficiency, Circuits, Throughput, Dataflow, Digital Accelerator, Energy-Efficient, Field-Programmable Gate Array (FPGA), Sparsity, Transformer
DOI: 10.1109/TCSII.2024.3462560
Indexed By: SCIE
Language: English
WOS Research Area: Engineering
WOS Subject: Engineering, Electrical & Electronic
WOS ID: WOS:001348293900026
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 445 HOES LANE, PISCATAWAY, NJ 08855-4141
Scopus ID: 2-s2.0-85204436132
Document Type: Journal article
Collections: Faculty of Science and Technology
The State Key Laboratory of Analog and Mixed-Signal VLSI (University of Macau)
Institute of Microelectronics
Department of Electrical and Computer Engineering
Corresponding Author: Un, Ka Fai
Affiliations:
1. University of Macau, State Key Laboratory of Analog and Mixed-Signal VLSI, Institute of Microelectronics, Faculty of Science and Technology - ECE, Macao
2. Universidade de Lisboa, Instituto Superior Técnico, Portugal
First Author Affiliation: Faculty of Science and Technology
Corresponding Author Affiliation: Faculty of Science and Technology
Recommended Citation
GB/T 7714: Cao, Rujian, Zhao, Zhongyu, Un, Ka Fai, et al. An FPGA-Based Transformer Accelerator With Parallel Unstructured Sparsity Handling for Question-Answering Applications[J]. IEEE Transactions on Circuits and Systems II: Express Briefs, 2024, 71(11): 4688-4692.
APA: Cao, R., Zhao, Z., Un, K. F., Yu, W. H., Martins, R. P., & Mak, P. I. (2024). An FPGA-Based Transformer Accelerator With Parallel Unstructured Sparsity Handling for Question-Answering Applications. IEEE Transactions on Circuits and Systems II: Express Briefs, 71(11), 4688-4692.
MLA: Cao, Rujian, et al. "An FPGA-Based Transformer Accelerator With Parallel Unstructured Sparsity Handling for Question-Answering Applications." IEEE Transactions on Circuits and Systems II: Express Briefs 71.11 (2024): 4688-4692.
Files in This Item:
There are no files associated with this item.
Google Scholar
Similar articles in Google Scholar
[Cao, Rujian]'s Articles
[Zhao, Zhongyu]'s Articles
[Un, Ka Fai]'s Articles
Baidu academic
Similar articles in Baidu academic
Bing Scholar
Similar articles in Bing Scholar

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.