Distributed Stochastic Optimization with Large Delays
Distributed Stochastic Optimization with Large Delays
The recent surge of breakthroughs in machine learning and artificial intelligence has sparked renewed interest in large-scale stochastic optimization problems that are universally considered hard. One of the most widely used methods for solving such problems is distributed asynchronous stochastic gradient descent (DASGD), a family of algorithms that result from …