Author of the publication

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima

(2016). arXiv:1609.04836. Comment: Accepted as a conference paper at ICLR 2017.

Other publications of authors with the same name

The Natural Language Decathlon: Multitask Learning as Question Answering. (2018). arXiv:1806.08730.

A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation. CoRR, (2018).

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima. ICLR, OpenReview.net, (2017).

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima. (2016). arXiv:1609.04836. Comment: Accepted as a conference paper at ICLR 2017.

Regularizing and Optimizing LSTM Language Models. ICLR (Poster), OpenReview.net, (2018).

adaQN: An Adaptive Quasi-Newton Algorithm for Training RNNs. ECML/PKDD (1), volume 9851 of Lecture Notes in Computer Science, pages 1-16. Springer, (2016).

Char2Subword: Extending the Subword Embedding Space Using Robust Character Compositionality. EMNLP (Findings), pages 1640-1651. Association for Computational Linguistics, (2021).

Global Capacity Measures for Deep ReLU Networks via Path Sampling. CoRR, (2019).

Unsupervised Paraphrase Generation via Dynamic Blocking. CoRR, (2020).

A second-order method for convex l1-regularized optimization with active-set prediction. Optim. Methods Softw., 31 (3): 605-621 (2016).