Abstract
We propose Nested LSTMs (NLSTM), a novel RNN architecture with multiple
levels of memory. Nested LSTMs add depth to LSTMs via nesting as opposed to
stacking. The value of a memory cell in an NLSTM is computed by an LSTM cell,
which has its own inner memory cell. Specifically, instead of computing the
value of the (outer) memory cell as $c^{outer}_t = f_t \odot c_{t-1} + i_t \odot g_t$, NLSTM memory cells use the concatenation $(f_t \odot c_{t-1}, i_t \odot g_t)$ as input to an inner LSTM (or NLSTM) memory cell, and set $c^{outer}_t = h^{inner}_t$. Nested LSTMs outperform both stacked and
single-layer LSTMs with similar numbers of parameters in our experiments on
various character-level language modeling tasks, and the inner memories of an
NLSTM learn longer-term dependencies compared with the higher-level units of a
stacked LSTM.
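
To make the update concrete, the following is a minimal NumPy sketch of one NLSTM cell step. It is not from the paper: the function names and weight shapes are hypothetical, biases are omitted, and the assignment of the two concatenated terms to the inner cell's input and hidden state is an assumption of this illustration.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W):
    # One step of a standard LSTM cell (biases omitted).
    z = W @ np.concatenate([x, h_prev])   # W: (4*d, dim(x) + d)
    i, f, o, g = np.split(z, 4)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c = f * c_prev + i * g                # classic memory update
    h = o * np.tanh(c)
    return h, c

def nlstm_step(x, h_prev, c_prev, c_inner_prev, W_outer, W_inner):
    # One step of an NLSTM cell: the outer memory update is
    # delegated to an inner LSTM cell.
    z = W_outer @ np.concatenate([x, h_prev])
    i, f, o, g = np.split(z, 4)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    # Instead of c_t = f * c_{t-1} + i * g, feed the pair
    # (f * c_{t-1}, i * g) to the inner cell. Treating i * g as the
    # inner input and f * c_{t-1} as the inner hidden state is an
    # assumption of this sketch.
    h_inner, c_inner = lstm_step(i * g, f * c_prev, c_inner_prev, W_inner)
    c = h_inner                           # c^{outer}_t = h^{inner}_t
    h = o * np.tanh(c)
    return h, c, c_inner

# Example usage with hypothetical dimensions:
d = 8
rng = np.random.default_rng(0)
W_outer = rng.normal(size=(4 * d, 2 * d)) * 0.1
W_inner = rng.normal(size=(4 * d, 2 * d)) * 0.1
x = rng.normal(size=d)
h = c = c_inner = np.zeros(d)
h, c, c_inner = nlstm_step(x, h, c, c_inner, W_outer, W_inner)

Note how the nesting adds depth inside the cell itself rather than across layers, as in a stacked LSTM.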