Presentation
OASR-WFBP: An Overlapping Aware Startup Sharing Gradient Merging Strategy for Efficient Communication in Distributed Deep Learning
Description
While Wait-Free Back-Propagation (WFBP) is a practical method in distributed deep learning, it suffers from large communication overhead. Ideally, this overhead can be reduced by overlapping gradient communication with computation and by sharing the startup time among multiple gradient communication phases. However, existing optimizations share the startup time greedily and fail to jointly exploit the overlapping opportunity between computation and communication. We propose an overlapping-aware startup-sharing WFBP (OASR-WFBP), with an analytic model designed to guide the sharing procedure. Evaluations show that OASR-WFBP achieves a 5%-13% reduction in iteration time over the state-of-the-art WFBP algorithm.
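The trade-off the abstract describes can be illustrated with a minimal sketch. The names, the alpha-beta cost model (sending m bytes costs alpha + beta*m, where alpha is the startup latency), and the decision rule below are assumptions for illustration, not the paper's actual analytic model: merging two gradient messages saves one startup term but delays the first message, shrinking its overlap with backward computation.

```python
# Hypothetical sketch of an overlap-aware gradient-merging decision.
# Assumed (not from the abstract): an alpha-beta cost model where
# communicating m bytes takes alpha + beta * m, and `comp_between`
# is the backward-computation time available to hide the first
# gradient's communication.

def comm_time(nbytes, alpha, beta):
    """Time to communicate a message of `nbytes` under the alpha-beta model."""
    return alpha + beta * nbytes

def should_merge(g1, g2, comp_between, alpha, beta):
    """Merge gradients of sizes g1 and g2 (bytes) only if saving one
    startup term (alpha) outweighs the overlap lost with the
    computation that could have hidden g1's communication."""
    # Exposed (non-hidden) time when sent separately: overlap hides
    # up to `comp_between` of g1's communication.
    separate_exposed = (max(comm_time(g1, alpha, beta) - comp_between, 0.0)
                        + comm_time(g2, alpha, beta))
    # A merged message pays one startup but cannot start until both
    # gradients are ready, so none of it overlaps `comp_between`.
    merged_exposed = comm_time(g1 + g2, alpha, beta)
    return merged_exposed < separate_exposed
```

With a large startup cost and little computation to overlap, merging wins; with cheap startup and ample computation to hide the first message, keeping the messages separate wins, which is why a purely greedy startup-sharing rule can lose overlap opportunities.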
Event Type
Work-in-Progress Poster
Time
Wednesday, July 12th, 6:00pm - 7:00pm PDT
Location
Level 2 Lobby