Residential College | false
Status | Published
Title | MPipeMoE: Memory Efficient MoE for Pre-trained Models with Adaptive Pipeline Parallelism
Authors | Zhang, Zheng [1]; Yang, Donglin [2]; Xia, Yaqi [1]; Ding, Liang [3]; Tao, Dacheng [3]; Zhou, Xiaobo [4]; Cheng, Dazhao [1]
Publication Date | 2023-06
Conference Name | 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS) |
Source Publication | Proceedings - 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS) |
Pages | 167-177 |
Conference Date | 15-19 May 2023 |
Conference Place | St. Petersburg, Florida, USA
Publication Place | USA |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Abstract | Recently, Mixture-of-Experts (MoE) has become one of the most popular techniques to scale pre-trained models to extraordinarily large sizes. Dynamic activation of experts allows for conditional computation, increasing the number of parameters of neural networks, which is critical for absorbing the vast amounts of knowledge available in many deep learning areas. However, despite the existing system and algorithm optimizations, there are significant challenges to be tackled when it comes to the inefficiencies of communication and memory consumption. In this paper, we present the design and implementation of MPipeMoE, a high-performance library that accelerates MoE training with adaptive and memory-efficient pipeline parallelism. Inspired by the observation that the MoE training procedure can be divided into multiple independent sub-stages, we design adaptive pipeline parallelism with an online algorithm to configure the granularity of the pipelining. Further, we analyze the memory footprint breakdown of MoE training and identify that activations and temporary buffers are the primary contributors to the overall memory footprint. Toward memory efficiency, we propose memory reusing strategies that reduce memory requirements by eliminating memory redundancies, and develop an adaptive selection component to determine the optimal strategy, considering both hardware capacities and model characteristics at runtime. We implement MPipeMoE upon PyTorch and evaluate it with common MoE models in a physical cluster consisting of 8 NVIDIA DGX A100 servers. Compared with the state-of-the-art approach, MPipeMoE achieves up to 2.8× speedup and reduces memory footprint by up to 47% in training large models.
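To make the sub-stage structure described in the abstract concrete, below is a minimal, illustrative PyTorch sketch of a top-1 MoE layer processed in pipeline chunks. This is not the MPipeMoE implementation or its API: the names (ToyExpert, moe_layer_pipelined, num_chunks) and tensor sizes are hypothetical, the dispatch/combine steps stand in for the all-to-all collectives a distributed MoE layer would issue, and the chunks run sequentially on CPU rather than overlapping communication with expert computation on CUDA streams as the paper describes.

```python
# Illustrative sketch only; assumes a single process and simulates
# dispatch/combine locally instead of using distributed all-to-all.
import torch
import torch.nn as nn


class ToyExpert(nn.Module):
    """A tiny feed-forward expert; d_model/d_hidden are hypothetical sizes."""

    def __init__(self, d_model=64, d_hidden=256):
        super().__init__()
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model)
        )

    def forward(self, x):
        return self.ffn(x)


def moe_layer_pipelined(tokens, experts, gate, num_chunks=4):
    """Run a top-1 MoE layer over `tokens`, split into `num_chunks` pipeline chunks.

    Each chunk passes through the independent sub-stages the abstract mentions:
    gating/dispatch, expert computation, and combine. In a distributed setting,
    dispatch and combine would be all-to-all collectives, and chunk i+1's
    communication could overlap with chunk i's expert compute.
    """
    outputs = []
    for chunk in tokens.chunk(num_chunks, dim=0):
        # Sub-stage 1: gating and (simulated) dispatch.
        expert_idx = gate(chunk).argmax(dim=-1)  # top-1 routing decision
        out = torch.empty_like(chunk)
        # Sub-stage 2: expert computation.
        for e, expert in enumerate(experts):
            mask = expert_idx == e
            if mask.any():
                out[mask] = expert(chunk[mask])
        # Sub-stage 3: (simulated) combine back into the original token order.
        outputs.append(out)
    return torch.cat(outputs, dim=0)


if __name__ == "__main__":
    d_model, num_experts = 64, 4
    experts = nn.ModuleList(ToyExpert(d_model) for _ in range(num_experts))
    gate = nn.Linear(d_model, num_experts)
    tokens = torch.randn(128, d_model)  # 128 tokens, hypothetical batch
    print(moe_layer_pipelined(tokens, experts, gate, num_chunks=4).shape)
```

In the paper's terms, the pipelining granularity corresponds to num_chunks here; the abstract's online algorithm adaptively configures that granularity at runtime rather than fixing it as this sketch does.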
Keyword | Mixture of Experts; Pipeline Parallelism; Distributed Training; Memory Efficiency
DOI | 10.1109/IPDPS54959.2023.00026 |
Indexed By | CPCI-S |
Language | English
WOS Research Area | Computer Science |
WOS Subject | Computer Science, Hardware & Architecture ; Computer Science, Software Engineering ; Computer Science, Theory & Methods |
WOS ID | WOS:001035517300017 |
Scopus ID | 2-s2.0-85166649221 |
Document Type | Conference paper |
Collection | THE STATE KEY LABORATORY OF INTERNET OF THINGS FOR SMART CITY (UNIVERSITY OF MACAU) |
Corresponding Author | Zhang, Zheng |
Affiliation | 1.Wuhan University 2.Nvidia Corp 3.JD.com Inc. 4.University of Macau |
Recommended Citation GB/T 7714 | Zhang Zheng, Yang Donglin, Xia Yaqi, et al. MPipeMoE: Memory Efficient MoE for Pre-trained Models with Adaptive Pipeline Parallelism[C]. USA: Institute of Electrical and Electronics Engineers Inc., 2023: 167-177.
APA | Zhang, Zheng, Yang, Donglin, Xia, Yaqi, Ding, Liang, Tao, Dacheng, Zhou, Xiaobo, & Cheng, Dazhao. (2023). MPipeMoE: Memory Efficient MoE for Pre-trained Models with Adaptive Pipeline Parallelism. Proceedings - 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 167-177.
Files in This Item: | There are no files associated with this item. |