T. Ma (2021). arXiv:2103.13462. Comment: This is Chapter 21 of the book "Beyond the Worst-Case Analysis of Algorithms".
Abstract
Non-convex optimization is ubiquitous in modern machine learning. Researchers
devise non-convex objective functions and optimize them with off-the-shelf
optimizers such as stochastic gradient descent and its variants, which leverage
the local geometry and update iteratively. Even though optimizing non-convex
functions is NP-hard in the worst case, optimization quality is rarely an issue
in practice; optimizers are widely believed to find approximate global minima.
Researchers hypothesize a unified explanation for this intriguing phenomenon:
most local minima of the objectives used in practice are approximately global
minima. We rigorously formalize this hypothesis for concrete instances of
machine learning problems.
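
To make the hypothesized phenomenon concrete, the following is a minimal sketch (not taken from the survey) of plain gradient descent run from several random initializations on the non-convex rank-one factorization objective f(x) = ||M - x x^T||_F^2, a standard example in this literature whose local minima are all global. The dimension, step size, and iteration count below are illustrative assumptions, not values from the paper.

# Illustrative sketch (not from the survey): gradient descent from random starts
# on f(x) = ||M - x x^T||_F^2 with M a rank-one PSD matrix, an objective whose
# local minima are all global minima.
import numpy as np

rng = np.random.default_rng(0)
d = 20
u = rng.normal(size=d)
M = np.outer(u, u)                      # rank-one target matrix

def loss(x):
    # squared Frobenius distance between x x^T and M
    return np.linalg.norm(M - np.outer(x, x)) ** 2

def grad(x):
    # gradient of ||M - x x^T||_F^2 with respect to x
    return 4.0 * (np.outer(x, x) - M) @ x

def gradient_descent(x0, lr=1e-3, steps=5000):
    # plain gradient descent; lr and steps are illustrative choices
    x = x0.copy()
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

# Every random initialization ends near loss 0, i.e. at an approximate global minimum.
final_losses = [loss(gradient_descent(rng.normal(size=d))) for _ in range(10)]
print(["%.2e" % v for v in final_losses])

The only critical points of this objective are the saddle point at the origin and the global minimizers at plus or minus u, which is why every run in the sketch reaches a loss near zero.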
Description
[2103.13462] Why Do Local Methods Solve Nonconvex Problems?
@article{ma2021local,
added-at = {2021-04-06T12:03:03.000+0200},
author = {Ma, Tengyu},
biburl = {https://www.bibsonomy.org/bibtex/25e0cb37046c0f330ab1d7db2f7f97be5/kirk86},
description = {[2103.13462] Why Do Local Methods Solve Nonconvex Problems?},
interhash = {802d37e9988dd4f8bb1264c1382bdc41},
intrahash = {5e0cb37046c0f330ab1d7db2f7f97be5},
keywords = {non-convex optimization survey},
note = {cite arxiv:2103.13462. Comment: This is Chapter 21 of the book "Beyond the Worst-Case Analysis of Algorithms"},
timestamp = {2021-04-06T12:03:03.000+0200},
title = {Why Do Local Methods Solve Nonconvex Problems?},
url = {http://arxiv.org/abs/2103.13462},
year = 2021
}