Author of the publication

Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?

ICML, volume 202 of Proceedings of Machine Learning Research, pages 17243-17259. PMLR, (2023)

Other publications of authors with the same name

Improved Asynchronous Parallel Optimization Analysis for Stochastic Incremental Methods. J. Mach. Learn. Res., (2018)

A Modern Take on the Bias-Variance Tradeoff in Neural Networks. CoRR, (2018)

A simpler approach to obtaining an O(1/t) convergence rate for the projected stochastic subgradient method. CoRR, (2012)

Bayesian Structure Learning with Generative Flow Networks. arXiv preprint arXiv:2202.13903, (2022)

A Tight and Unified Analysis of Gradient-Based Methods for a Whole Spectrum of Differentiable Games. AISTATS, volume 108 of Proceedings of Machine Learning Research, pages 2863-2873. PMLR, (2020)

On the Convergence of Continuous Constrained Optimization for Structure Learning. CoRR, (2020)

Multiset-Equivariant Set Prediction with Approximate Implicit Differentiation. ICLR, OpenReview.net, (2022)

Nonparametric Partial Disentanglement via Mechanism Sparsity: Sparse Actions, Interventions and Sparse Temporal Dependencies. CoRR, (2024)

Geometry-Aware Universal Mirror-Prox. CoRR, (2020)

Identifiability of Discretized Latent Coordinate Systems via Density Landmarks Detection. CoRR, (2023)