Abstract
This paper introduces multiplicative LSTM, a novel hybrid recurrent neural
network architecture for sequence modelling that combines the long short-term
memory (LSTM) and multiplicative recurrent neural network architectures.
Multiplicative LSTM is motivated by the flexibility of having very different
recurrent transition functions for each possible input, which we argue makes
it more expressive for autoregressive density estimation. We show
empirically that multiplicative LSTM outperforms standard LSTM and its deep
variants on a range of character-level language modelling tasks, and that this
improvement grows as the complexity of the task scales up. This model
achieves a test error of 1.19 bits/character on the last 4 million characters
of the Hutter prize dataset when combined with dynamic evaluation.
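The core idea of multiplicative LSTM is to form an intermediate state as an elementwise product of projections of the current input and the previous hidden state, and to feed that state into the usual LSTM gates, so the effective recurrent transition changes with each input. A minimal NumPy sketch of one such step is given below; biases are omitted and the parameter names (`Wmx`, `Wmh`, etc.) are our own labels for illustration, not the paper's notation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mlstm_step(x, h_prev, c_prev, params):
    """One multiplicative LSTM step (illustrative sketch, biases omitted).

    The intermediate state m depends multiplicatively on both the current
    input and the previous hidden state, so the recurrent transition
    applied to h_prev differs for each possible input.
    """
    p = params
    m = (p["Wmx"] @ x) * (p["Wmh"] @ h_prev)       # input-dependent recurrent state
    h_hat = np.tanh(p["Whx"] @ x + p["Whm"] @ m)   # candidate update
    i = sigmoid(p["Wix"] @ x + p["Wim"] @ m)       # input gate
    f = sigmoid(p["Wfx"] @ x + p["Wfm"] @ m)       # forget gate
    o = sigmoid(p["Wox"] @ x + p["Wom"] @ m)       # output gate
    c = f * c_prev + i * h_hat                     # new cell state
    h = np.tanh(c) * o                             # new hidden state
    return h, c
```

A standard LSTM step would instead compute its gates directly from `h_prev`; replacing `h_prev` with the multiplicative state `m` is what gives the hybrid its per-input flexibility.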