Inproceedings

Deep Learning is not a Matter of Depth but of Good Training

Björn Barz and Joachim Denzler.
International Conference on Pattern Recognition and Artificial Intelligence (ICPRAI), pages 683–687. CENPARMI, Concordia University, (2018)

Abstract

In the past few years, deep neural networks have often been claimed to provide greater representational power than shallow networks. In this work, we propose a wide, shallow, and strictly sequential network architecture without any residual connections. When trained with cyclical learning rate schedules, this simple network achieves a classification accuracy on CIFAR-100 competitive with that of a 10 times deeper residual network, while it can be trained 4 times faster. This provides evidence that neither depth nor residual connections are crucial for deep learning. Instead, residual connections seem merely to facilitate training with plain SGD by helping it avoid bad local minima. We believe that our work can hence point the research community to the actual bottleneck of contemporary deep learning: the optimization algorithms.
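The exact Plain11 architecture and training hyperparameters are given in the paper itself; as a rough illustration of the idea in the abstract, the PyTorch sketch below builds a wide, strictly sequential CNN without any skip connections and pairs it with a triangular cyclical learning rate. All layer counts, channel widths, and hyperparameter values here are illustrative assumptions, not the authors' configuration.

```python
# Hypothetical sketch of a wide, shallow, strictly sequential CNN
# trained with a cyclical learning rate, in the spirit of the paper.
# Widths, depth, and hyperparameters are assumptions for illustration.
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    # Plain conv -> batch norm -> ReLU; note: no residual connection.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

plain_net = nn.Sequential(           # strictly sequential: no skip paths
    conv_block(3, 256), conv_block(256, 256), conv_block(256, 256),
    nn.MaxPool2d(2),                 # 32x32 -> 16x16
    conv_block(256, 512), conv_block(512, 512), conv_block(512, 512),
    nn.MaxPool2d(2),                 # 16x16 -> 8x8
    conv_block(512, 512), conv_block(512, 512),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(512, 100),             # CIFAR-100 has 100 classes
)

optimizer = torch.optim.SGD(plain_net.parameters(), lr=0.01,
                            momentum=0.9, weight_decay=5e-4)

# Triangular cyclical learning rate: the LR oscillates between base_lr
# and max_lr, which is the kind of schedule the abstract credits with
# letting plain SGD avoid bad local minima without residual connections.
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=1e-4, max_lr=0.1, step_size_up=2000)

# Inside the training loop, step the scheduler once per batch:
#   loss.backward(); optimizer.step(); scheduler.step()
```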

Users

  • @bjoern.barz
  • @s364315

Comments and Reviews

  • @s364315 (3 years ago): Reference for the Plain11 neural network architecture