A key appeal of the recently proposed Neural Ordinary Differential
Equation (ODE) framework is that it seems to provide a continuous-time extension
of discrete residual neural networks. As we show herein, though, trained Neural
ODE models actually depend on the specific numerical method used during
training. If the trained model is supposed to be a flow generated from an ODE,
it should be possible to choose another numerical solver with equal or smaller
numerical error without loss of performance. We observe that if training relies
on a solver with overly coarse discretization, then testing with another solver
of equal or smaller numerical error results in a sharp drop in accuracy. In
such cases, the combination of vector field and numerical method cannot be
interpreted as a flow generated from an ODE, which arguably constitutes a fatal
breakdown of the Neural ODE concept. We observe, however, that there exists a
critical step size beyond which the training yields a valid ODE vector field.
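
The solver-swap test described above is straightforward to state in code. The sketch below is our own illustration under stated assumptions, not the authors' implementation: euler, rk4, vector_field, and solver_consistency_gap are hypothetical names, and the trained Neural ODE right-hand side is stood in for by a plain callable f(t, y). A large gap between the coarse training-solver output and a much more accurate reference solve signals that the model has fit the discretization rather than an underlying ODE.

    import numpy as np

    def euler(f, y0, t0, t1, n_steps):
        """Fixed-step forward Euler integration of y' = f(t, y)."""
        y, t = np.asarray(y0, dtype=float), t0
        h = (t1 - t0) / n_steps
        for _ in range(n_steps):
            y = y + h * f(t, y)
            t += h
        return y

    def rk4(f, y0, t0, t1, n_steps):
        """Fixed-step classical fourth-order Runge-Kutta integration."""
        y, t = np.asarray(y0, dtype=float), t0
        h = (t1 - t0) / n_steps
        for _ in range(n_steps):
            k1 = f(t, y)
            k2 = f(t + h / 2, y + h / 2 * k1)
            k3 = f(t + h / 2, y + h / 2 * k2)
            k4 = f(t + h, y + h * k3)
            y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
            t += h
        return y

    def solver_consistency_gap(vector_field, y0, n_train_steps=4):
        """Distance between the (coarse) training-solver output and a much
        more accurate reference solve of the same learned vector field.
        A large gap means the model depends on the discretization and
        cannot be interpreted as a proper ODE flow."""
        coarse = euler(vector_field, y0, 0.0, 1.0, n_train_steps)
        reference = rk4(vector_field, y0, 0.0, 1.0, 100 * n_train_steps)
        return float(np.linalg.norm(coarse - reference))
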
We propose a method that monitors the behavior of the ODE solver during
training to adapt its step size, aiming to ensure a valid ODE without
unnecessarily increasing computational cost. We verify this adaptation algorithm
on two common benchmark datasets as well as a synthetic dataset. Furthermore,
we introduce a novel synthetic dataset in which the underlying ODE directly
generates a classification task.
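
The abstract does not spell out the monitoring criterion, so the following is a minimal sketch of one plausible monitor, assuming it compares the training solver at step size h against the same solver at h/2 and refines the grid when the two disagree; adapt_step_count and vector_field are hypothetical names, and euler is the fixed-step helper from the sketch above.

    import numpy as np

    def adapt_step_count(vector_field, y0_batch, n_steps, tol=1e-2,
                         t0=0.0, t1=1.0):
        """Hypothetical training-time monitor: re-solve the current batch
        with twice as many fixed Euler steps and refine the grid only when
        the two solutions disagree beyond tol."""
        coarse = np.stack([euler(vector_field, y0, t0, t1, n_steps)
                           for y0 in y0_batch])
        fine = np.stack([euler(vector_field, y0, t0, t1, 2 * n_steps)
                         for y0 in y0_batch])
        gap = float(np.max(np.linalg.norm(coarse - fine, axis=-1)))
        # Disagreement between step sizes h and h/2 suggests the learned
        # field is exploiting discretization error; halve the step size
        # (double the step count) for subsequent epochs.
        return 2 * n_steps if gap > tol else n_steps

Called every few epochs, such a monitor would grow the step count only when needed, which matches the abstract's stated aim of ensuring a valid ODE without unnecessarily increasing computational cost.
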
@misc{ott2020neural,
abstract = {A key appeal of the recently proposed Neural Ordinary Differential
Equation (ODE) framework is that it seems to provide a continuous-time extension
of discrete residual neural networks. As we show herein, though, trained Neural
ODE models actually depend on the specific numerical method used during
training. If the trained model is supposed to be a flow generated from an ODE,
it should be possible to choose another numerical solver with equal or smaller
numerical error without loss of performance. We observe that if training relies
on a solver with overly coarse discretization, then testing with another solver
of equal or smaller numerical error results in a sharp drop in accuracy. In
such cases, the combination of vector field and numerical method cannot be
interpreted as a flow generated from an ODE, which arguably constitutes a fatal
breakdown of the Neural ODE concept. We observe, however, that there exists a
critical step size beyond which the training yields a valid ODE vector field.
We propose a method that monitors the behavior of the ODE solver during
training to adapt its step size, aiming to ensure a valid ODE without
unnecessarily increasing computational cost. We verify this adaptation algorithm
on two common benchmark datasets as well as a synthetic dataset. Furthermore,
we introduce a novel synthetic dataset in which the underlying ODE directly
generates a classification task.},
added-at = {2021-06-28T16:06:42.000+0200},
author = {Ott, Katharina and Katiyar, Prateek and Hennig, Philipp and Tiemann, Michael},
biburl = {https://www.bibsonomy.org/bibtex/2c0d27e104834572ac77d15a43e886f26/adulny},
description = {[2007.15386] When are Neural ODE Solutions Proper ODEs?},
interhash = {5493223ed99181cbd0e0e7807492441e},
intrahash = {c0d27e104834572ac77d15a43e886f26},
keywords = {deep-learning from:adulny neural-ode ode solver step-size},
note = {cite arxiv:2007.15386},
timestamp = {2021-06-28T16:06:42.000+0200},
title = {When are Neural ODE Solutions Proper ODEs?},
url = {http://arxiv.org/abs/2007.15386},
year = 2020
}