Residential College: false
Status: Published
DNN Surgery: Accelerating DNN Inference on the Edge through Layer Partitioning
Huanghuang Liang1; Qianlong Sang1; Chuang Hu1; Dazhao Cheng1; Xiaobo Zhou2; Dan Wang3; Wei Bao4; Yu Wang5
2023-03-20
Source Publication: IEEE Transactions on Cloud Computing
ISSN: 2168-7161
Volume: 11, Issue: 3, Pages: 3111-3125
Abstract

Recent advances in deep neural networks (DNNs) have substantially improved the accuracy and speed of various intelligent applications. Nevertheless, one obstacle is that DNN inference imposes a heavy computation burden on end devices, while offloading inference tasks to the cloud causes a large volume of data transmission. Motivated by the fact that the data size of some intermediate DNN layers is significantly smaller than that of the raw input data, we design DNN surgery, which allows a partitioned DNN to be processed at both the edge and the cloud while limiting data transmission. The challenge is twofold: (1) network dynamics substantially influence the performance of a DNN partition, and (2) state-of-the-art DNNs are characterized by a directed acyclic graph rather than a chain, which makes partitioning considerably more complicated. To address these issues, we design a Dynamic Adaptive DNN Surgery (DADS) scheme, which optimally partitions the DNN under different network conditions. We also study the partition problem in a cost-constrained system, where the cloud resources available for inference are limited. A real-world prototype based on a self-driving car video dataset is then implemented, showing that, compared with current approaches, DNN surgery can improve latency by up to 6.45 times and throughput by up to 8.31 times. We further evaluate DNN surgery through two case studies, an indoor intrusion detection application and a campus traffic monitoring application, in which DNN surgery shows consistently high throughput and low latency.
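
The trade-off described in the abstract can be illustrated with a minimal sketch. This is not the DADS algorithm from the paper: it assumes a simple chain of layers rather than a directed acyclic graph, and it only enumerates every split point to find the one with the lowest end-to-end latency. The function name `best_chain_split`, the latency model (edge compute + transfer + cloud compute), and all numbers are hypothetical and chosen purely for illustration.

```python
from typing import List, Tuple


def best_chain_split(
    edge_ms: List[float],
    cloud_ms: List[float],
    out_bytes: List[int],
    input_bytes: int,
    bandwidth_bytes_per_ms: float,
) -> Tuple[int, float]:
    """Return (k, latency_ms) where layers [0, k) run on the edge
    and layers [k, n) run in the cloud.

    Assumed latency model (hypothetical, for illustration only):
    edge compute + upload of the tensor crossing the cut + cloud compute.
    """
    n = len(edge_ms)
    if len(cloud_ms) != n or len(out_bytes) != n:
        raise ValueError("per-layer lists must have equal length")
    if bandwidth_bytes_per_ms <= 0:
        raise ValueError("bandwidth must be positive")

    best_k, best_latency = 0, float("inf")
    for k in range(n + 1):
        edge_time = sum(edge_ms[:k])
        cloud_time = sum(cloud_ms[k:])
        if k == 0:
            sent = input_bytes          # everything offloaded: upload raw input
        elif k == n:
            sent = 0                    # assumption: result is consumed on the edge
        else:
            sent = out_bytes[k - 1]     # upload the output of the last edge layer
        latency = edge_time + sent / bandwidth_bytes_per_ms + cloud_time
        if latency < best_latency:
            best_k, best_latency = k, latency
    return best_k, best_latency


# Hypothetical 4-layer chain: the edge is 10x slower than the cloud,
# and intermediate outputs shrink with depth.
EDGE_MS = [20, 30, 40, 50]
CLOUD_MS = [2, 3, 4, 5]
OUT_BYTES = [400_000, 50_000, 20_000, 1_000]
INPUT_BYTES = 600_000

for bw in (100, 10_000, 1_000_000):  # bytes per millisecond
    k, latency = best_chain_split(EDGE_MS, CLOUD_MS, OUT_BYTES, INPUT_BYTES, bw)
    print(f"bandwidth={bw} B/ms -> split after layer {k}, latency={latency:.1f} ms")
```

With these made-up numbers the best split moves from all-edge (slow link) to a mid-network cut (moderate link) to all-cloud (fast link), which is the sensitivity to network dynamics that the abstract names as the first challenge. A DAG-structured network cannot be handled by enumerating a single cut index like this, which corresponds to the second challenge.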

Keywords: Cloud Computing; Computation Offloading; Deep Learning; Deep Neural Networks; Delays; Edge Computing; Inference Acceleration; Layer Partitioning; Neural Networks; Surgery; Throughput; Visual Analytics
DOI: 10.1109/TCC.2023.3258982
Indexed By: SCIE
Language: English
WOS Research Area: Computer Science
WOS Subject: Computer Science, Information Systems; Computer Science, Software Engineering; Computer Science, Theory & Methods
WOS ID: WOS:001063436300062
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 445 HOES LANE, PISCATAWAY, NJ 08855-4141
Scopus ID: 2-s2.0-85151518708
Document Type: Journal article
Collection: Faculty of Science and Technology; The State Key Laboratory of Internet of Things for Smart City (University of Macau); Department of Computer and Information Science
Corresponding Author: Chuang Hu; Dazhao Cheng
Affiliation:
1. School of Computer Science, Wuhan University, Hubei, China
2. State Key Laboratory of Internet of Things for Smart City & the Department of Computer and Information Sciences, University of Macau, Macau, China
3. Department of Computing, The Hong Kong Polytechnic University, Hong Kong
4. School of Computer Science, The University of Sydney, Sydney, NSW, Australia
5. Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA
Recommended Citation
GB/T 7714: Huanghuang Liang, Qianlong Sang, Chuang Hu, et al. DNN Surgery: Accelerating DNN Inference on the Edge through Layer Partitioning[J]. IEEE Transactions on Cloud Computing, 2023, 11(3), 3111-3125.
APA: Huanghuang Liang, Qianlong Sang, Chuang Hu, Dazhao Cheng, Xiaobo Zhou, Dan Wang, Wei Bao, & Yu Wang (2023). DNN Surgery: Accelerating DNN Inference on the Edge through Layer Partitioning. IEEE Transactions on Cloud Computing, 11(3), 3111-3125.
MLA: Huanghuang Liang, et al. "DNN Surgery: Accelerating DNN Inference on the Edge through Layer Partitioning". IEEE Transactions on Cloud Computing 11.3 (2023): 3111-3125.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.