Presentation
SuperFast: Fast Supernet Training using Initial Knowledge
Description
Once-for-all NAS trains a supernet once and extracts specialized subnets from it for efficient multi-target deployment. However, the training cost is high: the state-of-the-art ElasticViT and NASViT take over 72 and 83 GPU days, respectively. While prior approaches have accelerated training using supernet warm-up, we argue that this is suboptimal because knowledge is more easily scaled upward than downward. Hence, we propose SuperFast, a simple workflow that (i) pretrains a subnet of the supernet and (ii) distributes its knowledge within the supernet before training. Applying SuperFast to the ElasticViT and NASViT supernets reaches the baselines' accuracy 1.4x and 1.8x faster, respectively, on ImageNet.
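The sketch below illustrates the two-step idea on a toy weight-sharing layer; it is not the authors' implementation. The names (ToySuperLinear, pretrain_subnet, seed_supernet) and the way subnet weights are replicated into the full supernet are assumptions made only to show step (i), pretraining a small subnet, and step (ii), spreading its weights into the supernet before supernet training begins.

```python
# Minimal sketch of the SuperFast workflow on a toy supernet (hypothetical names).
import torch
import torch.nn as nn

class ToySuperLinear(nn.Module):
    """Weight-sharing layer: a subnet uses only the first `width` hidden units."""
    def __init__(self, in_dim=16, max_width=64, n_classes=10):
        super().__init__()
        self.shared = nn.Linear(in_dim, max_width)
        self.head = nn.Linear(max_width, n_classes)

    def forward(self, x, width=None):
        h = self.shared(x)
        if width is not None:
            # Mask hidden units beyond the active subnet width.
            h = torch.cat([h[:, :width], torch.zeros_like(h[:, width:])], dim=1)
        return self.head(torch.relu(h))

def pretrain_subnet(model, width, data, epochs=1):
    """Step (i): pretrain only a small subnet (the first `width` hidden units)."""
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in data:
            opt.zero_grad()
            loss_fn(model(x, width=width), y).backward()
            opt.step()

def seed_supernet(model, width):
    """Step (ii): copy the pretrained subnet weights into the remaining
    supernet channels, so the supernet starts from the subnet's knowledge
    (scaling knowledge upward rather than downward)."""
    with torch.no_grad():
        w, b = model.shared.weight, model.shared.bias
        reps = -(-w.shape[0] // width)  # ceiling division
        w[width:] = w[:width].repeat(reps, 1)[: w.shape[0] - width]
        b[width:] = b[:width].repeat(reps)[: b.shape[0] - width]

# Usage: pretrain the subnet, seed the supernet, then run standard supernet training.
model = ToySuperLinear()
data = [(torch.randn(8, 16), torch.randint(0, 10, (8,))) for _ in range(4)]
pretrain_subnet(model, width=16, data=data)
seed_supernet(model, width=16)
# ... standard once-for-all supernet training would follow here ...
```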
Event Type
Research Manuscript
Time
Monday, June 23, 11:30am - 11:45am PDT
Location
3000, Level 3
AI
AI1: AI/ML Algorithms