kirk86 > readings optimization

bookmarks (2)

Michel Goemans - Teaching
Math, Theoretical CS, Stats and Optimization courses
4 years ago by @kirk86
tags: course, lecture, mathematics, notes, optimization, readings, stats, website
Stochastic Approximation
Stochastic Approximation Method, Robbins & Monro
4 years ago by @kirk86
tags: approximate, notes, optimization, readings, slides, stochastic, tutorials


publications (136)

Accelerating Smooth Games by Manipulating Spectral Shapes
W. Azizian, D. Scieur, I. Mitliagkas, S. Lacoste-Julien, and G. Gidel. (2020). arXiv:2001.00602.
4 years ago by @kirk86
tags: game-theory, geometry, optimization, readings
A Unified Analysis of Gradient-Based Methods for a Whole Spectrum of Games
W. Azizian, I. Mitliagkas, S. Lacoste-Julien, and G. Gidel. (2019). arXiv:1906.05945.
4 years ago by @kirk86
tags: game-theory, generative-models, optimization, readings
Inference Suboptimality in Variational Autoencoders
C. Cremer, X. Li, and D. Duvenaud. (2018). arXiv:1801.03558. Comment: ICML.
4 years ago by @kirk86
tags: amortised, optimization, readings, stochastic, variational
Stochastic Gradient Estimation
M. Fu. (2005)
4 years ago by @kirk86
tags: optimization, readings
Fisher Information and Natural Gradient Learning in Random Deep Networks
S. Amari, R. Karakida, and M. Oizumi. Proceedings of Machine Learning Research, volume 89, pages 694-702. PMLR, (16-18 Apr 2019)
4 years ago by @kirk86
tags: deep-learning, geometry, information, optimization, readings
Lower Bounds for Non-Convex Stochastic Optimization
Y. Arjevani, Y. Carmon, J. Duchi, D. Foster, N. Srebro, and B. Woodworth. (2019). arXiv:1912.02365.
4 years ago by @kirk86
tags: bounds, generalization, optimization, readings
Convex Optimization: Algorithms and Complexity
S. Bubeck. (2015). arXiv:1405.4980. Comment: A previous version of the manuscript was titled "Theory of Convex Optimization for Machine Learning".
4 years ago by @kirk86
tags: convex, optimization, readings
Convex Optimization: Algorithms and Complexity
S. Bubeck. (2015)
4 years ago by @kirk86
tags: convex, optimization, readings, survey
A Modern Introduction to Online Learning
F. Orabona. (2019). arXiv:1912.13213.
4 years ago by @kirk86
tags: online-learning, optimization, readings
The role of over-parametrization in generalization of neural networks
B. Neyshabur, Z. Li, S. Bhojanapalli, Y. LeCun, and N. Srebro. International Conference on Learning Representations, (2019)
4 years ago by @kirk86
tags: bounds, generalization, iclr2019, optimization, readings, theory
Dual Space Preconditioning for Gradient Descent
C. Maddison, D. Paulin, Y. Teh, and A. Doucet. (2019). arXiv:1902.02257. Comment: Major revision, including simpler equivalent conditions for dual relative smoothness and applications to exponential penalty functions and p-norm regression.
4 years ago by @kirk86
tags: optimization, readings
Generalized Variational Inference: Three arguments for deriving new Posteriors
J. Knoblauch, J. Jewson, and T. Damoulas. (2019). arXiv:1904.02063. Comment: 103 pages, 23 figures (comprehensive revision of previous version).
4 years ago by @kirk86
tags: approximate, bayesian, optimization, readings, uncertainty, variational
Deep Ensembles: A Loss Landscape Perspective
S. Fort, H. Hu, and B. Lakshminarayanan. (2019). arXiv:1912.02757.
4 years ago by @kirk86
tags: generalization, optimization, readings, uncertainty
Stochastic Variational Optimization
T. Bird, J. Kunze, and D. Barber. (2018). arXiv:1809.04855.
4 years ago by @kirk86
tags: bayesian, optimization, readings, uncertainty, variational
High-dimensional graphs and variable selection with the Lasso
N. Meinshausen and P. Bühlmann. (2006). arXiv:math/0608017. Comment: Published at http://dx.doi.org/10.1214/009053606000000281 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
4 years ago by @kirk86
tags: graphs, optimization, readings
Gradient methods for minimizing composite objective function
Y. Nesterov. (2007)
4 years ago by @kirk86
tags: optimization, readings
Forward Amortized Inference for Likelihood-Free Variational Marginalization
L. Ambrogioni, U. Güclü, J. Berezutskaya, E. van den Borne, Y. Güclütürk, M. Hinne, E. Maris, and M. van Gerven. Proceedings of Machine Learning Research, volume 89, pages 777-786. PMLR, (16-18 Apr 2019)
4 years ago by @kirk86
tags: amortised, optimization, readings, variational
Adversarial Variational Optimization of Non-Differentiable Simulators
G. Louppe, J. Hermans, and K. Cranmer. Proceedings of Machine Learning Research, volume 89, pages 1438-1447. PMLR, (16-18 Apr 2019)
4 years ago by @kirk86
tags: adversarial, optimization, readings, variational
Universal Statistics of Fisher Information in Deep Neural Networks: Mean Field Approach
R. Karakida, S. Akaho, and S. Amari. (2019)
4 years ago by @kirk86
tags: optimization, readings, variational
The composite absolute penalties family for grouped and hierarchical variable selection
P. Zhao, G. Rocha, and B. Yu. (2009). arXiv:0909.0411. Comment: Published at http://dx.doi.org/10.1214/07-AOS584 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).
4 years ago by @kirk86
tags: bayesian, optimization, readings, variable-selection
Unreasonable Effectiveness of Learning Neural Networks: From Accessible States and Robust Ensembles to Basic Algorithmic Schemes
C. Baldassi, C. Borgs, J. Chayes, A. Ingrosso, C. Lucibello, L. Saglietti, and R. Zecchina. (2016). arXiv:1605.06444. Comment: 31 pages (14 main text, 18 appendix), 12 figures (6 main text, 6 appendix).
4 years ago by @kirk86
tags: generalization, optimization, readings, robustness
Subdominant Dense Clusters Allow for Simple Learning and High Computational Performance in Neural Networks with Discrete Synapses
C. Baldassi, A. Ingrosso, C. Lucibello, L. Saglietti, and R. Zecchina. (2015). arXiv:1509.05753. Comment: 11 pages, 4 figures (main text: 5 pages, 3 figures; Supplemental Material: 6 pages, 1 figure).
4 years ago by @kirk86
tags: generalization, optimization, readings
Generalization Bounds in the Predict-then-Optimize Framework
O. Balghiti, A. Elmachtoub, P. Grigas, and A. Tewari. (2019). arXiv:1905.11488.
4 years ago by @kirk86
tags: bounds, generalization, learning, optimization, readings
Regret to the Best vs. Regret to the Average
E. Even-Dar, M. Kearns, Y. Mansour, and J. Wortman. Learning Theory, pages 233-247. Berlin, Heidelberg, Springer Berlin Heidelberg, (2007)
4 years ago by @kirk86
tags: online-learning, optimization, readings
Dual Averaging Method for Regularized Stochastic Learning and Online Optimization
L. Xiao. Pages 2116-2124. (2019)
4 years ago by @kirk86
tags: best_paper, neurips2019, optimization, readings
Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems
S. Bubeck and N. Cesa-Bianchi. (2012). arXiv:1204.5721. Comment: To appear in Foundations and Trends in Machine Learning.
4 years ago by @kirk86
tags: bandits, online-learning, optimization, readings, survey
Bandit Algorithms
T. Lattimore and C. Szepesvari. (2019)
4 years ago by @kirk86
tags: bandits, book, online-learning, optimization, readings
Introduction to Multi-Armed Bandits
A. Slivkins. (2019). arXiv:1904.07272. Comment: The manuscript is complete, but comments are very welcome! To be published with Foundations and Trends in Machine Learning.
4 years ago by @kirk86
tags: bandits, book, optimization, readings, survey
Soft robust solutions to possibilistic optimization problems
A. Kasperski and P. Zielinski. (2019). arXiv:1912.01516.
4 years ago by @kirk86
tags: bayesian, optimization, probability, readings, robustness
On the Heavy-Tailed Theory of Stochastic Gradient Descent for Deep Neural Networks
U. Şimşekli, M. Gürbüzbalaban, T. Nguyen, G. Richard, and L. Sagun. (2019). arXiv:1912.00018. Comment: 32 pages. arXiv admin note: substantial text overlap with arXiv:1901.06053.
4 years ago by @kirk86
tags: bounds, deep-learning, optimization, readings
An empirical analysis of the optimization of deep network loss surfaces
D. Im, M. Tao, and K. Branson. (2016). arXiv:1612.04010.
4 years ago by @kirk86
tags: generalization, optimization, readings
Derivative-Free Estimation of the Score Vector and Observed Information Matrix with Application to State-Space Models
A. Doucet, P. Jacob, and S. Rubenthaler. (2013). arXiv:1304.5768. Comment: Technical report, 43 pages, 7 figures.
4 years ago by @kirk86
tags: bayesian, mcmc, optimization, probability, readings, stats
A Tight Convergence Analysis for Stochastic Gradient Descent with Delayed Updates
Y. Arjevani, O. Shamir, and N. Srebro. (2018). arXiv:1806.10188.
4 years ago by @kirk86
tags: convergence, generalization, mathematics, optimization, readings
First-Order Regret Analysis of Thompson Sampling
S. Bubeck and M. Sellke. (2019). arXiv:1902.00681. Comment: 27 pages.
4 years ago by @kirk86
tags: bayesian, bounds, combinatorics, online-learning, optimization, readings, sampling
Student Specialization in Deep ReLU Networks With Finite Width and Input Dimension
Y. Tian. (2019). arXiv:1909.13458.
4 years ago by @kirk86
tags: deep-learning, generalization, optimization, readings
Stochastic Gradient Descent as Approximate Bayesian Inference
S. Mandt, M. Hoffman, and D. Blei. (2017). arXiv:1704.04289. Comment: 35 pages, published version (JMLR 2017).
5 years ago by @kirk86
tags: approximate, bayesian, optimization, readings
Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
S. Arora, S. Du, W. Hu, Z. Li, and R. Wang. (2019). arXiv:1901.08584. Comment: In ICML 2019.
5 years ago by @kirk86
tags: deep-learning, generalization, optimization, readings, theory
Massively scalable Sinkhorn distances via the Nyström method
J. Altschuler, F. Bach, A. Rudi, and J. Niles-Weed. (2019)
5 years ago by @kirk86
tags: optimization, readings, regularisation, stable
Generalization in Reinforcement Learning with Selective Noise Injection and Information Bottleneck
M. Igl, K. Ciosek, Y. Li, S. Tschiatschek, C. Zhang, S. Devlin, and K. Hofmann. (2019). arXiv:1910.12911. Comment: Published at NeurIPS 2019.
5 years ago by @kirk86
tags: approximate, compression, generalization, information, optimization, readings, theory
Multi-Layer Convolutional Sparse Modeling: Pursuit and Dictionary Learning
J. Sulam, V. Papyan, Y. Romano, and M. Elad. (2017). arXiv:1708.08705.
5 years ago by @kirk86
tags: approximate, generalization, optimization, readings, sparsity
Convolutional Neural Networks Analyzed via Convolutional Sparse Coding
V. Papyan, Y. Romano, and M. Elad. (2016). arXiv:1607.08194.
5 years ago by @kirk86
tags: approximate, generalization, optimization, readings, sparsity, theory
Convexified Convolutional Neural Networks
Y. Zhang, P. Liang, and M. Wainwright. (2016). arXiv:1609.01000. Comment: 29 pages.
5 years ago by @kirk86
tags: deep-learning, learning, optimization, readings, theory
Deep Learning and the Information Bottleneck Principle
N. Tishby and N. Zaslavsky. (2015). arXiv:1503.02406. Comment: 5 pages, 2 figures, Invited paper to ITW 2015; 2015 IEEE Information Theory Workshop (ITW) (IEEE ITW 2015).
5 years ago by @kirk86
tags: compression, generalization, information, optimization, readings, sparsity
Emergence of Invariance and Disentanglement in Deep Representations
A. Achille and S. Soatto. (2017). arXiv:1706.01350. Comment: Deep learning, neural network, representation, flat minima, information bottleneck, overfitting, generalization, sufficiency, minimality, sensitivity, information complexity, stochastic gradient descent, regularization, total correlation, PAC-Bayes.
5 years ago by @kirk86
tags: complexity, deep-learning, feature-selection, generalization, optimization, readings, sparsity
Understanding Trainable Sparse Coding via Matrix Factorization
T. Moreau and J. Bruna. (2016). arXiv:1609.00285. Comment: Published as a conference paper at ICLR 2017.
5 years ago by @kirk86
tags: matrix-factorization, optimization, readings, sparsity
Stochastic gradient Markov chain Monte Carlo
C. Nemeth and P. Fearnhead. (2019). arXiv:1907.06986.
5 years ago by @kirk86
tags: mcmc, optimization, readings, sampling, stats
Are deep ResNets provably better than linear predictors?
C. Yun, S. Sra, and A. Jadbabaie. (2019). arXiv:1907.03922. Comment: 17 pages.
5 years ago by @kirk86
tags: generalization, objectives, optimization, readings, theory
Empirical Analysis of the Hessian of Over-Parametrized Neural Networks
L. Sagun, U. Evci, V. Guney, Y. Dauphin, and L. Bottou. (2017). arXiv:1706.04454. Comment: Minor update for ICLR 2018 Workshop Track presentation.
5 years ago by @kirk86
tags: dynamic, non-linear, optimization, readings
Identifying and attacking the saddle point problem in high-dimensional non-convex optimization
Y. Dauphin, R. Pascanu, C. Gulcehre, K. Cho, S. Ganguli, and Y. Bengio. (2014). arXiv:1406.2572. Comment: The theoretical review and analysis in this article draw heavily from arXiv:1405.4604 cs.LG.
5 years ago by @kirk86
tags: optimization, readings
The Full Spectrum of Deepnet Hessians at Scale: Dynamics with SGD Training and Sample Size
V. Papyan. (2018). arXiv:1811.07062.
5 years ago by @kirk86
tags: dynamic, optimization, readings
Qualitatively characterizing neural network optimization problems
I. Goodfellow, O. Vinyals, and A. Saxe. (2014). arXiv:1412.6544.
5 years ago by @kirk86
tags: optimization, readings
REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models
G. Tucker, A. Mnih, C. Maddison, D. Lawson, and J. Sohl-Dickstein. (2017). arXiv:1703.07370. Comment: NIPS 2017.
5 years ago by @kirk86
tags: approximate, generalization, optimization, readings
WNGrad: Learn the Learning Rate in Gradient Descent
X. Wu, R. Ward, and L. Bottou. (2018). arXiv:1803.02865. Comment: 10 pages, 3 figures, conference.
5 years ago by @kirk86
tags: optimization, readings
An Exponential Learning Rate Schedule for Deep Learning
Z. Li and S. Arora. (2019). arXiv:1910.07454. Comment: 27 pages, 15 figures.
5 years ago by @kirk86
tags: objectives, optimization, readings
Recovering Hidden Components in Multimodal Data with Composite Diffusion Operators
T. Shnitzer, M. Ben-Chen, L. Guibas, R. Talmon, and H. Wu. (2018). arXiv:1808.07312.
5 years ago by @kirk86
tags: diffusion, kernels, matrix-factorization, optimization, readings
Emergent properties of the local geometry of neural loss landscapes
S. Fort and S. Ganguli. (2019). arXiv:1910.05929. Comment: 10 pages, 8 figures.
5 years ago by @kirk86
tags: deep-learning, geometry, objectives, optimization, readings
The Complexity of Making the Gradient Small in Stochastic Convex Optimization
D. Foster, A. Sekhari, O. Shamir, N. Srebro, K. Sridharan, and B. Woodworth. (2019). arXiv:1902.04686.
5 years ago by @kirk86
tags: optimization, readings
Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks
Z. Ji and M. Telgarsky. (2019). arXiv:1909.12292.
5 years ago by @kirk86
tags: generalization, optimization, readings, theory
Mean Field Limit of the Learning Dynamics of Multilayer Neural Networks
P. Nguyen. (2019). arXiv:1902.02880.
5 years ago by @kirk86
tags: approximate, dynamic, generalization, optimization, probability, readings
Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit
S. Mei, T. Misiakiewicz, and A. Montanari. (2019). arXiv:1902.06015. Comment: 61 pages.
5 years ago by @kirk86
tags: approximate, dynamic, generalization, optimization, probability, readings
A Mean Field View of the Landscape of Two-Layers Neural Networks
S. Mei, A. Montanari, and P. Nguyen. (2018). arXiv:1804.06561. Comment: 103 pages.
5 years ago by @kirk86
tags: approximate, deep-learning, generalization, optimization, readings
Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent
J. Lee, L. Xiao, S. Schoenholz, Y. Bahri, R. Novak, J. Sohl-Dickstein, and J. Pennington. (2019). arXiv:1902.06720. Comment: 11+17 pages, open-source code available at https://github.com/google/neural-tangents.
5 years ago by @kirk86
tags: deep-learning, generalization, kernels, optimization, readings, theory
Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel
C. Wei, J. Lee, Q. Liu, and T. Ma. (2018). arXiv:1810.05369. Comment: version 2: title changed from originally "On the Margin Theory of Feedforward Neural Networks". Substantial changes from old version of paper, including a new lower bound on NTK sample complexity. version 3: reorganized NTK lower bound proof.
5 years ago by @kirk86
tags: deep-learning, generalization, optimization, readings, theory, regularisation
Data-dependent Sample Complexity of Deep Neural Networks via Lipschitz Augmentation
C. Wei and T. Ma. (2019). arXiv:1905.03684.
5 years ago by @kirk86
tags: bounds, deep-learning, generalization, optimization, probability, readings, stats, theory
Logarithmic Regret for Online Control
N. Agarwal, E. Hazan, and K. Singh. (2019). arXiv:1909.05062.
5 years ago by @kirk86
tags: bounds, convergence, optimization, readings
Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes
A. Agarwal, S. Kakade, J. Lee, and G. Mahajan. (2019). arXiv:1908.00261. Comment: Additional references and discussion of prior work.
5 years ago by @kirk86
tags: approximate, markov-processes, optimization, readings, reinforcement-learning
Opening the Black Box of Deep Neural Networks via Information
R. Shwartz-Ziv and N. Tishby. (2017). arXiv:1703.00810. Comment: 19 pages, 8 figures.
5 years ago by @kirk86
tags: approximate, compression, deep-learning, generalization, information, optimization, readings, theory
The Geometric Foundations of Hamiltonian Monte Carlo
M. Betancourt, S. Byrne, S. Livingstone, and M. Girolami. (2014). arXiv:1410.5110. Comment: 45 pages, 13 figures.
5 years ago by @kirk86
tags: bayesian, geometry, mcmc, optimization, probability, readings, stats
Monte Carlo Gradient Estimation in Machine Learning
S. Mohamed, M. Rosca, M. Figurnov, and A. Mnih. (2019). arXiv:1906.10652. Comment: 59 pages, under review.
5 years ago by @kirk86
tags: bayesian, gradients, mcmc, optimization, probability, readings, stats, survey
Implicit Regularization in Deep Matrix Factorization
S. Arora, N. Cohen, W. Hu, and Y. Luo. (2019). arXiv:1905.13655.
5 years ago by @kirk86
tags: matrix-factorization, optimization, readings
Explaining Landscape Connectivity of Low-cost Solutions for Multilayer Nets
R. Kuditipudi, X. Wang, H. Lee, Y. Zhang, Z. Li, W. Hu, S. Arora, and R. Ge. (2019). arXiv:1906.06247.
5 years ago by @kirk86
tags: deep-learning, foundations, machine-learning, optimization, readings, theory
Fast and Simple Natural-Gradient Variational Inference with Mixture of Exponential-family Approximations
W. Lin, M. Khan, and M. Schmidt. (2019). arXiv:1906.02914. Comment: Accepted as a conference paper at ICML 2019.
5 years ago by @kirk86
tags: deep-learning, flows, optimization, readings, stats, theory, variational
On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport
L. Chizat and F. Bach. (2018). arXiv:1805.09545. Comment: Advances in Neural Information Processing Systems (NIPS), Dec 2018, Montréal, Canada.
5 years ago by @kirk86
tags: convergence, optimal-transport, optimization, readings
Algorithmic Regularization in Over-parameterized Matrix Sensing and Neural Networks with Quadratic Activations
Y. Li, T. Ma, and H. Zhang. (2017). arXiv:1712.09203. Comment: COLT 2018 best paper; fixed minor missing steps in the previous version.
5 years ago by @kirk86
tags: deep-learning, foundations, machine-learning, optimization, readings, stable, theory
The Implicit Bias of Gradient Descent on Separable Data
D. Soudry, E. Hoffer, M. Nacson, S. Gunasekar, and N. Srebro. (2017). arXiv:1710.10345. Comment: Final JMLR version, with improved discussions over v3. Main improvements in journal version over conference version (v2 appeared in ICLR): We proved the measure zero case for main theorem (with implications for the rates), and the multi-class case.
5 years ago by @kirk86
tags: deep-learning, foundations, machine-learning, optimization, readings, regularisation, stable, theory
New insights and perspectives on the natural gradient method
J. Martens. (2014). arXiv:1412.1193. Comment: Many small revisions/corrections throughout; added a section on 2nd-order methods and future work.
5 years ago by @kirk86
tags: information, optimization, readings, theory
