Abstract
The convergence of back-propagation learning is
analyzed so as to explain common phenomena observed by
practitioners. Many undesirable behaviors of backprop
can be avoided with tricks that are rarely exposed in
serious technical publications. This paper gives some
of those tricks, and offers explanations of why they
work. Many authors have suggested that second-order
optimization methods are advantageous for neural net
training. It is shown that most "classical"
second-order methods are impractical for large neural
networks. A few methods are proposed that do not have
these limitations.