Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off
Description
Over-parameterization of deep neural networks (DNNs) delivers high prediction accuracy across many applications. Although effective, the large number of parameters hinders deployment on resource-limited devices and carries an outsized environmental impact. Sparse training can significantly reduce training costs by shrinking the model size. However, existing sparse training methods rely mainly on random or greedy drop-and-grow strategies, which results in low accuracy. In this work, we characterize Dynamic Sparse Training as a trade-off between exploitation of important weights and exploration of coverage. We further design an acquisition function, provide theoretical guarantees for the proposed method, and clarify its convergence properties.
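To make the drop-and-grow idea concrete, below is a minimal, hypothetical sketch of one mask-update step. It drops the smallest-magnitude active weights (greedy pruning) and regrows connections using a UCB-style acquisition score that combines gradient magnitude (exploitation) with a coverage bonus for rarely grown connections (exploration). The specific acquisition form, the `visits` counter, and all function names are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def drop_and_grow(weights, grads, mask, visits, drop_frac=0.3, c=1.0):
    """One illustrative drop-and-grow step for dynamic sparse training.

    NOTE: this is a sketch under assumptions; the acquisition
    score |grad| + c*sqrt(log(t)/visits) is a UCB-style stand-in,
    not the acquisition function proposed in the paper.
    """
    active = np.flatnonzero(mask)
    n_drop = int(drop_frac * active.size)
    if n_drop == 0:
        return mask, visits

    # Drop: deactivate the smallest-magnitude active weights (greedy prune).
    drop_idx = active[np.argsort(np.abs(weights[active]))[:n_drop]]
    mask[drop_idx] = 0

    # Grow: score inactive positions with an acquisition that balances
    # exploitation (gradient magnitude) and exploration (coverage bonus).
    inactive = np.flatnonzero(mask == 0)
    t = visits.sum() + 1.0
    score = np.abs(grads[inactive]) + c * np.sqrt(np.log(t) / (visits[inactive] + 1.0))
    grow_idx = inactive[np.argsort(score)[-n_drop:]]
    mask[grow_idx] = 1
    visits[grow_idx] += 1
    weights[grow_idx] = 0.0  # newly grown weights start at zero
    return mask, visits
```

The sparsity level is preserved by construction: exactly as many connections are grown as were dropped, so the overall parameter budget stays fixed across updates.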
Event Type
Research Manuscript
Time
Wednesday, July 12th, 4:10pm - 4:25pm PDT
Location
3010, 3rd Floor
AI/ML Architecture Design