Presentation
TSO: Boosting Rematerialization Training via Optimal Tensor Scheduling Optimization
Description
Training large-scale models has become increasingly challenging because of the GPU memory wall problem.
Rematerialization in graph mode lacks universality, while in eager mode it is inefficient.
Moreover, existing methods are hard to build on for further research because they are difficult to modify and extend.
In this paper, we propose TSO, a unified and efficient training framework that boosts large-scale model rematerialization training via optimal tensor scheduling optimization, achieving universality and efficiency at the same time.
TSO requires no intrusive modification to existing frameworks and can be enabled with a single line of code.
Experimental results demonstrate that TSO achieves better training efficiency than state-of-the-art methods, with up to a 7.99× speedup over intrusive methods.
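The listing does not show TSO's API itself, but the "single line of code" claim can be illustrated. Below is a minimal, hypothetical sketch of what such a non-intrusive enablement could look like in PyTorch: the function name `enable_rematerialization` is an assumption (not TSO's actual entry point), and PyTorch's built-in `torch.utils.checkpoint` stands in for TSO's optimal tensor scheduling.

```python
# Hypothetical sketch only: TSO's real entry point is not given in this
# listing. PyTorch activation checkpointing stands in for TSO's scheduler.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


def enable_rematerialization(model: nn.Module) -> nn.Module:
    """Wrap each child module's forward so its activations are recomputed
    during the backward pass instead of being kept in GPU memory."""
    for child in model.children():
        original_forward = child.forward
        # Bind original_forward per child via a default argument.
        child.forward = lambda *args, _f=original_forward: checkpoint(
            _f, *args, use_reentrant=False
        )
    return model


model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 1024))
model = enable_rematerialization(model)  # the "one line" to opt in

x = torch.randn(8, 1024, requires_grad=True)
model(x).sum().backward()  # activations are rematerialized in backward
```

Because the change is confined to a wrapper, the model definition and training loop stay untouched, which is the sense in which such an approach is non-intrusive.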
Event Type
Networking
Work-in-Progress Poster
Time
Sunday, June 22, 6:00pm - 7:00pm PDT
Location
Level 3 Lobby