Author of the publication

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to the name to display some publications already assigned to the person.

 

Other publications of authors with the same name

On the Information Bottleneck Theory of Deep Learning. ICLR (Poster), OpenReview.net, (2018)
Distributional Generalization: A New Kind of Generalization. CoRR, (2020)
Deep Double Descent: Where Bigger Models and More Data Hurt. (2019). arXiv:1912.02292. Comment: G.K. and Y.B. contributed equally.
Minnorm training: an algorithm for training over-parameterized deep neural networks. CoRR, (2018)
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models. CoRR, (2023)
On the information bottleneck theory of deep learning. (2018)
The unreasonable effectiveness of few-shot learning for machine translation. CoRR, (2023)
Revisiting Model Stitching to Compare Neural Representations. NeurIPS, page 225-236. (2021)
Data Scaling Laws in NMT: The Effect of Noise and Architecture. ICML, volume 162 of Proceedings of Machine Learning Research, page 1466-1482. PMLR, (2022)
Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modelling. CoRR, (2020)