Abstract
This paper presents an investigation of the approximation property of neural
networks with unbounded activation functions, such as the rectified linear unit
(ReLU), which is the new de-facto standard of deep learning. The ReLU network
can be analyzed by the ridgelet transform with respect to Lizorkin
distributions. By showing three reconstruction formulas by using the Fourier
slice theorem, the Radon transform, and Parseval's relation, it is shown that a
neural network with unbounded activation functions still satisfies the
universal approximation property. As an additional consequence, the ridgelet
transform, or the backprojection filter in the Radon domain, is what the
network learns after backpropagation. Subject to a constructive admissibility
condition, the trained network can be obtained by simply discretizing the
ridgelet transform, without backpropagation. Numerical examples not only
support the consistency of the admissibility condition but also imply that some
non-admissible cases result in low-pass filtering.
Users
Please
log in to take part in the discussion (add own reviews or comments).