UM

Browse/Search Results: 1-2 of 2

InSS: An Intelligent Scheduling Orchestrator for Multi-GPU Inference with Spatio-Temporal Sharing (Journal article)
Han, Ziyi; Zhou, Ruiting; Xu, Chengzhong; Zeng, Yifan; Zhang, Renli. InSS: An Intelligent Scheduling Orchestrator for Multi-GPU Inference with Spatio-Temporal Sharing[J]. IEEE Transactions on Parallel and Distributed Systems, 2024, 35(10): 1735-1748.
TC[WOS]: 0 | TC[Scopus]: 2 | IF: 5.6/4.5 | Submit date: 2024/08/05
Keywords: DNN Inference; GPU Resource Management; Online Scheduling
Raptor-T: A Fused and Memory-Efficient Sparse Transformer for Long and Variable-Length Sequences (Journal article)
Wang, Hulin; Yang, Donglin; Xia, Yaqi; Zhang, Zheng; Wang, Qigang; Fan, Jianping; Zhou, Xiaobo; Cheng, Dazhao. Raptor-T: A Fused and Memory-Efficient Sparse Transformer for Long and Variable-Length Sequences[J]. IEEE Transactions on Computers, 2024, 73(7): 1852-1865.
TC[WOS]: 1 | TC[Scopus]: 1 | IF: 3.6/3.2 | Submit date: 2024/05/16
Keywords: Sparse Transformer; Inference Acceleration; GPU; Deep Learning; Memory Optimization; Resource Management