The Sample Complexity of Gradient Descent in Stochastic Convex Optimization


We analyze the sample complexity of full-batch Gradient Descent (GD) in the setting of non-smooth Stochastic Convex Optimization. We show that the generalization error of GD, with a common choice of hyper-parameters, can be $\tilde \Theta(d/m + 1/\sqrt{m})$, where $d$ is the dimension and $m$ is the sample size. This matches …
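
For concreteness, here is a minimal sketch of the algorithm under analysis: full-batch (sub)gradient descent on the empirical risk, returning the averaged iterate as is standard for non-smooth objectives. The names `full_batch_gd` and `grad_f`, and the specific step-size and iteration choices noted in the comments, are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def full_batch_gd(grad_f, w0, T, eta):
    """Full-batch (sub)gradient descent on the empirical risk.

    grad_f: returns a (sub)gradient of the empirical risk at w,
            averaged over all m training samples (hence "full-batch").
    w0:     initial point in R^d.
    T:      number of iterations (commonly T = Theta(m)).
    eta:    step size (commonly eta ~ 1/sqrt(T) for non-smooth losses).
    """
    w = np.asarray(w0, dtype=float)
    iterates = [w.copy()]
    for _ in range(T):
        w = w - eta * grad_f(w)  # one step on the full training batch
        iterates.append(w.copy())
    # Averaged iterate: the standard output for non-smooth convex GD.
    return np.mean(iterates, axis=0)
```

The bound $\tilde \Theta(d/m + 1/\sqrt{m})$ concerns the generalization error of such a procedure, i.e., the gap between the population risk and the empirical risk at its output.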