Abstract
While the universal approximation property holds for both hierarchical and
shallow networks, we prove that deep (hierarchical) networks can approximate
the class of compositional functions with the same accuracy as shallow networks
but with an exponentially smaller number of training parameters and lower
VC-dimension. This theorem settles an old conjecture by Bengio on the role of
depth in networks. We then define a general class of scalable, shift-invariant
algorithms and show that a simple and natural set of requirements justifies
deep convolutional networks.
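
As an illustration (a minimal sketch, not part of the original abstract, assuming the binary-tree case of compositional functions studied in the paper): a compositional function over $d = 8$ variables may be written as a hierarchy of two-variable constituent functions,
\[
f(x_1,\dots,x_8) \;=\; h_3\bigl(h_{21}\bigl(h_{11}(x_1,x_2),\, h_{12}(x_3,x_4)\bigr),\; h_{22}\bigl(h_{13}(x_5,x_6),\, h_{14}(x_7,x_8)\bigr)\bigr),
\]
where each constituent $h$ depends on only two arguments. A deep network whose architecture mirrors this tree needs to approximate only low-dimensional constituents, whereas a generic shallow network must approximate $f$ as a function of all $d$ variables at once, which is the source of the exponential gap in parameters claimed above.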