Deep Learning Works in Practice. But Does it Work in Theory?
L. Hoang and R. Guerraoui (2018). arXiv:1801.10437. Comment: 6 pages, 4 figures.
Abstract
Deep learning relies on a very specific kind of neural network: those
superposing several neural layers. In the last few years, deep learning
achieved major breakthroughs in many tasks such as image analysis, speech
recognition, and natural language processing. Yet, there is no theoretical
explanation of this success. In particular, it is not clear why the deeper
the network, the better it actually performs.
We argue that the explanation is intimately connected to a key feature of the
data collected from our surrounding universe to feed machine learning
algorithms: large non-parallelizable logical depth. Roughly speaking, we
conjecture that the shortest computational descriptions of the universe are
algorithms with inherently large computation times, even when a large number of
computers are available for parallelization. Interestingly, this conjecture,
combined with the folklore conjecture in theoretical computer science that
$P \neq NC$, explains the success of deep learning.
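The notion of logical depth invoked in the abstract can be illustrated with a small, self-contained sketch (our illustration, not from the paper): the value computed below has a short description (the program itself), yet producing it requires a long chain of steps in which each step depends on the previous one, so adding parallel processors does not obviously shorten the wall-clock time. Iterated SHA-256 hashing is used here only as a stand-in for an inherently sequential process; the function name deep_object is hypothetical.

    import hashlib

    def deep_object(steps: int = 100_000, seed: bytes = b"universe") -> bytes:
        # Iterate SHA-256 sequentially: step i cannot start before step i-1
        # finishes, so the running time grows with `steps` even on a
        # massively parallel machine.
        x = seed
        for _ in range(steps):
            x = hashlib.sha256(x).digest()
        return x

    # Short description, long (sequential) computation history.
    print(deep_object().hex())

If the $P \neq NC$ conjecture holds, such sequential computations cannot in general be collapsed into shallow, highly parallel ones, which is the intuition the authors connect to the need for deep (many-layer) networks.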
@misc{hoang2018learning,
abstract = {Deep learning relies on a very specific kind of neural network: those
superposing several neural layers. In the last few years, deep learning
achieved major breakthroughs in many tasks such as image analysis, speech
recognition, and natural language processing. Yet, there is no theoretical
explanation of this success. In particular, it is not clear why the deeper
the network, the better it actually performs.
We argue that the explanation is intimately connected to a key feature of the
data collected from our surrounding universe to feed machine learning
algorithms: large non-parallelizable logical depth. Roughly speaking, we
conjecture that the shortest computational descriptions of the universe are
algorithms with inherently large computation times, even when a large number of
computers are available for parallelization. Interestingly, this conjecture,
combined with the folklore conjecture in theoretical computer science that
$P \neq NC$, explains the success of deep learning.},
added-at = {2018-02-10T13:33:14.000+0100},
author = {Hoang, Lê Nguyên and Guerraoui, Rachid},
biburl = {https://www.bibsonomy.org/bibtex/2e7c00b633684a7ea13582dfd934f61b6/jk_itwm},
description = {1801.10437.pdf},
interhash = {c9a190d58e54ebd9bf41296dd7e79819},
intrahash = {e7c00b633684a7ea13582dfd934f61b6},
keywords = {overview to_read},
note = {cite arxiv:1801.10437. Comment: 6 pages, 4 figures},
timestamp = {2018-02-10T13:33:14.000+0100},
title = {Deep Learning Works in Practice. But Does it Work in Theory?},
url = {http://arxiv.org/abs/1801.10437},
year = 2018
}