Abstract
We propose Nested LSTMs (NLSTM), a novel RNN architecture with multiple
levels of memory. Nested LSTMs add depth to LSTMs via nesting as opposed to
stacking. The value of a memory cell in an NLSTM is computed by an LSTM cell,
which has its own inner memory cell. Specifically, instead of computing the
value of the (outer) memory cell as $c^{outer}_t = f_t \odot c_{t-1} + i_t \odot g_t$, NLSTM memory cells use the concatenation $(f_t \odot c_{t-1}, i_t \odot g_t)$ as input to an inner LSTM (or NLSTM) memory cell, and set $c^{outer}_t = h^{inner}_t$. Nested LSTMs outperform both stacked and
single-layer LSTMs with similar numbers of parameters in our experiments on
various character-level language modeling tasks, and the inner memories of an
NLSTM learn longer-term dependencies compared with the higher-level units of a
stacked LSTM.
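
To make the update concrete, the following is a minimal NumPy sketch of one NLSTM cell step. It is not from the paper: the function names and weight shapes are hypothetical, biases are omitted, and the assignment of the two concatenated terms to the inner cell's input and hidden state is an assumption of this illustration.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W):
    # One step of a standard LSTM cell (biases omitted).
    z = W @ np.concatenate([x, h_prev])   # W: (4*d, dim(x) + d)
    i, f, o, g = np.split(z, 4)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c = f * c_prev + i * g                # classic memory update
    h = o * np.tanh(c)
    return h, c

def nlstm_step(x, h_prev, c_prev, c_inner_prev, W_outer, W_inner):
    # One step of an NLSTM cell: the outer memory update is
    # delegated to an inner LSTM cell.
    z = W_outer @ np.concatenate([x, h_prev])
    i, f, o, g = np.split(z, 4)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    # Instead of c_t = f * c_{t-1} + i * g, feed the pair
    # (f * c_{t-1}, i * g) to the inner cell. Treating i * g as the
    # inner input and f * c_{t-1} as the inner hidden state is an
    # assumption of this sketch.
    h_inner, c_inner = lstm_step(i * g, f * c_prev, c_inner_prev, W_inner)
    c = h_inner                           # c^{outer}_t = h^{inner}_t
    h = o * np.tanh(c)
    return h, c, c_inner

# Example usage with hypothetical dimensions:
d = 8
rng = np.random.default_rng(0)
W_outer = rng.normal(size=(4 * d, 2 * d)) * 0.1
W_inner = rng.normal(size=(4 * d, 2 * d)) * 0.1
x = rng.normal(size=d)
h = c = c_inner = np.zeros(d)
h, c, c_inner = nlstm_step(x, h, c, c_inner, W_outer, W_inner)

Note how the nesting adds depth inside the cell itself rather than across layers, as in a stacked LSTM.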