Abstract
Why do deep neural networks generalize despite their very high-dimensional parameter spaces? We take an information-theoretic approach. We find that the dimensionality of the parameter space can be studied through singular semi-Riemannian geometry and is upper-bounded by the sample size. We adapt Fisher information to this singular neuromanifold. Using random matrix theory, we derive a minimum description length of a deep learning model, in which the spectrum of the Fisher information matrix plays a key role in improving generalization.
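A minimal sketch of the central object the abstract refers to, assuming a toy logistic-regression model rather than the paper's deep network: the empirical Fisher information matrix and its eigenvalue spectrum. All data, the model, and the parameter point are synthetic illustrations, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

n, d = 200, 5                       # sample size and parameter dimension
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (X @ w_true > 0).astype(float)  # synthetic binary labels

w = rng.normal(size=d)              # parameter point at which F is evaluated
p = 1.0 / (1.0 + np.exp(-(X @ w)))  # model probabilities sigma(X w)

# Per-sample score vectors g_i = d/dw log p(y_i | x_i, w) = (y_i - p_i) x_i
G = (y - p)[:, None] * X

# Empirical Fisher information matrix F = (1/n) * sum_i g_i g_i^T
F = G.T @ G / n

# Its eigenvalue spectrum; rank(F) <= min(n, d), echoing the abstract's point
# that the effective parameter dimensionality is bounded by the sample size.
eigvals = np.linalg.eigvalsh(F)
print(np.sort(eigvals)[::-1])
```

In the degenerate (singular) regime the abstract describes, many of these eigenvalues are at or near zero, which is what makes the neuromanifold singular rather than Riemannian.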