- The document analyzes the convergence rate of gradient descent for minimizing loss functions that are both strongly convex and smooth.
- It shows that, with a suitably chosen step size, the gradient descent iterates converge geometrically to the optimal solution: the distance to the optimum shrinks by a constant factor at every iteration (see the first sketch after this list).
- It then applies this analysis to generalized linear models, showing that after sufficiently many iterations gradient descent estimates the true parameter to within a radius proportional to the statistical error (see the second sketch below).
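A minimal numerical sketch of the geometric-convergence claim (my illustration, not an experiment from the document): for a quadratic loss $f(\theta) = \tfrac{1}{2}\theta^\top A\theta - b^\top\theta$ whose eigenvalues lie in $[\mu, L]$, gradient descent with the classical step size $\eta = 2/(\mu + L)$ contracts the error by the factor $(L-\mu)/(L+\mu)$ per step. The dimension, eigenvalue range, and iteration count below are assumed values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative strongly convex quadratic: f(theta) = 0.5 theta' A theta - b' theta,
# with A symmetric positive definite, mu = lambda_min(A), L = lambda_max(A).
d = 20
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
eigs = np.linspace(1.0, 10.0, d)      # assumed spectrum: mu = 1, L = 10
A = Q @ np.diag(eigs) @ Q.T
b = rng.standard_normal(d)

theta_star = np.linalg.solve(A, b)    # exact minimizer, for measuring the error
mu, L = eigs[0], eigs[-1]
eta = 2.0 / (mu + L)                  # step size giving the best contraction factor
rate = (L - mu) / (L + mu)            # per-iteration contraction of the error

theta = np.zeros(d)
err0 = np.linalg.norm(theta - theta_star)
for t in range(1, 51):
    theta -= eta * (A @ theta - b)    # gradient step: grad f(theta) = A theta - b
    err = np.linalg.norm(theta - theta_star)
    # Geometric convergence: err <= rate**t * err0 at every iteration.
    assert err <= rate**t * err0 + 1e-12

print(f"error after 50 steps: {err:.3e}  (bound: {rate**50 * err0:.3e})")
```

For a quadratic, the recursion is exact: $\theta^{t+1} - \theta^* = (I - \eta A)(\theta^t - \theta^*)$, and the spectral norm of $I - \eta A$ equals the contraction factor above, which is why the assertion holds at every step.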
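A hedged sketch of the GLM application, taking logistic regression as the generalized linear model; the model, sample size, and step size here are my assumptions for illustration, not necessarily the document's setting. Gradient descent on the empirical logistic loss drives the iterate into a ball around the true parameter whose radius is on the order of the statistical error, roughly $\sqrt{d/n}$, after which further iterations give no statistical improvement.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative GLM: logistic regression with an assumed true parameter theta_star.
n, d = 5000, 10
theta_star = rng.standard_normal(d) / np.sqrt(d)
X = rng.standard_normal((n, d))
p = 1.0 / (1.0 + np.exp(-X @ theta_star))
y = rng.binomial(1, p).astype(float)

def grad(theta):
    """Gradient of the empirical logistic loss (average negative log-likelihood)."""
    return X.T @ (1.0 / (1.0 + np.exp(-X @ theta)) - y) / n

# Assumed step size; the logistic loss is smooth with L <= lambda_max(X'X)/(4n),
# which is about 1/4 for this design, so eta = 1 is well within the stable range.
eta = 1.0
theta = np.zeros(d)
for t in range(500):
    theta -= eta * grad(theta)
    if t % 100 == 99:
        print(f"iter {t+1:4d}: ||theta - theta_star|| = "
              f"{np.linalg.norm(theta - theta_star):.4f}")

# The error plateaus once it reaches the statistical-error scale ~ sqrt(d/n):
# beyond that point, gradient descent converges to the empirical minimizer,
# whose distance to theta_star is set by the noise in the data, not by optimization.
print(f"statistical-error scale sqrt(d/n) = {np.sqrt(d / n):.4f}")
```

Running this, the error drops geometrically at first and then flattens near the $\sqrt{d/n}$ scale, matching the document's point that enough iterations place the estimate within a radius proportional to the statistical error.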