Author of the publication

Implicit Bias of SGD for Diagonal Linear Networks: a Provable Benefit of Stochasticity.

, , and . NeurIPS, page 29218-29230. (2021)


Other publications of authors with the same name

Computational-Statistical Gaps in Gaussian Single-Index Models., , , and . CoRR, (2024)
Last iterate convergence of SGD for Least-Squares in the Interpolation regime., , and . NeurIPS, page 21581-21591. (2021)
Exponential convergence of testing error for stochastic gradient methods., , and . CoRR, (2017)
SGD with large step sizes learns sparse features., , , and . CoRR, (2022)
Statistical Optimality of Stochastic Gradient Descent on Hard Learning Problems through Multiple Passes., , and . NeurIPS, page 8125-8135. (2018)
Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation., , and . COLT, volume 178 of Proceedings of Machine Learning Research, page 2127-2159. PMLR, (2022)
On Single Index Models beyond Gaussian Data., , and . CoRR, (2023)
On Learning Gaussian Multi-index Models with Gradient Flow., , and . CoRR, (2023)
Implicit Bias of SGD for Diagonal Linear Networks: a Provable Benefit of Stochasticity., , and . NeurIPS, page 29218-29230. (2021)
Overcoming the curse of dimensionality with Laplacian regularization in semi-supervised learning., , , and . NeurIPS, page 30439-30451. (2021)