
ParGNN: A Scalable Graph Neural Network Training Framework on multi-GPUs
Description
Full-batch training of Graph Neural Networks (GNNs) is indispensable for interdisciplinary applications. Although full-batch training has advantages in convergence accuracy and speed, it still faces challenges such as severe load imbalance and high communication overhead. To address these challenges, we propose ParGNN, an efficient full-batch GNN training system that adopts a profiler-guided adaptive load balancing method in conjunction with graph over-partitioning to alleviate load imbalance. Building on the over-partitioning results, we present a subgraph pipeline algorithm that overlaps communication with computation while preserving the accuracy of GNN training. Extensive experiments demonstrate that ParGNN not only attains the highest accuracy but also reaches the preset accuracy in the shortest time.
In end-to-end experiments on four datasets, ParGNN outperforms two state-of-the-art full-batch GNN systems, PipeGCN and DGL, achieving speedups of up to 2.7$\times$ and 21.8$\times$, respectively.
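
To make the subgraph pipelining idea concrete, below is a minimal Python sketch, not ParGNN's actual code, of how an over-partitioned local graph could overlap halo-feature communication with per-subgraph computation using PyTorch's non-blocking collectives. All names here (exchange_halo, pipelined_forward, the subgraph attributes) are hypothetical placeholders, and the equally sized all-gather is a simplification of the sparse boundary exchange a real system would perform.

# Hypothetical sketch of communication/computation overlap across
# over-partitioned subgraphs. Not ParGNN's implementation.
import torch
import torch.distributed as dist

def exchange_halo(boundary_feats, world_size):
    """Start a non-blocking all-gather of this rank's boundary-node features.
    (A real system would send each neighbor only the features it needs.)"""
    recv = [torch.empty_like(boundary_feats) for _ in range(world_size)]
    handle = dist.all_gather(recv, boundary_feats, async_op=True)
    return handle, recv

def pipelined_forward(subgraphs, layer, world_size):
    """Run one GNN layer over the over-partitioned subgraphs, prefetching the
    halo features of subgraph i+1 while computing on subgraph i."""
    outputs = []
    # Kick off the exchange for the first subgraph.
    handle, recv = exchange_halo(subgraphs[0].boundary_feats, world_size)
    for i, sg in enumerate(subgraphs):
        # Issue the next subgraph's exchange before computing on this one,
        # so the transfer proceeds concurrently with the GPU kernels.
        if i + 1 < len(subgraphs):
            nxt = exchange_halo(subgraphs[i + 1].boundary_feats, world_size)
        handle.wait()  # ensure remote halo features for subgraph i have arrived
        halo = torch.cat(recv, dim=0)
        h = torch.cat([sg.local_feats, halo], dim=0)
        outputs.append(layer(sg.adj, h))  # compute overlaps the pending exchange
        if i + 1 < len(subgraphs):
            handle, recv = nxt
    return outputs

Because the computation on subgraph i only needs the features already received, waiting on one exchange while the next is in flight keeps the network and the GPU busy at the same time, which is the overlap the abstract describes.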
Event Type
Research Manuscript
Time
Monday, June 23, 11:00am - 11:15am PDT
Location
3000, Level 3
Topics
AI
Tracks
AI1: AI/ML Algorithms