Abstract
We consider derivative-free algorithms for stochastic and non-stochastic
convex optimization problems that use only function values rather than
gradients. Focusing on non-asymptotic bounds on convergence rates, we show that
if pairs of function values are available, algorithms for $d$-dimensional
optimization that use gradient estimates based on random perturbations suffer a
factor of at most $\sqrt{d}$ in convergence rate over traditional stochastic
gradient methods. We establish such results for both smooth and non-smooth
cases, sharpening previous analyses that suggested a worse dimension
dependence, and extend our results to the case of multiple ($m \ge 2$)
evaluations. We complement our algorithmic development with
information-theoretic lower bounds on the minimax convergence rate of such
problems, establishing the sharpness of our achievable results up to constant
(sometimes logarithmic) factors.
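
For intuition, one representative two-point construction (an illustrative sketch; the precise estimators and scalings analyzed in the paper may differ) draws a direction $u$ uniformly from the unit sphere in $\mathbb{R}^d$ and, for a small smoothing parameter $\delta > 0$, forms the gradient estimate
$$\hat{g} = d \, \frac{f(x + \delta u) - f(x - \delta u)}{2\delta} \, u,$$
which uses exactly one pair of function values and is an unbiased estimate of the gradient of a smoothed version of $f$.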