Learning Context Sensitive Languages with LSTM Trained with Kalman Filters

Abstract

Unlike traditional recurrent neural networks, the Long Short-Term Memory (LSTM) model generalizes well when presented with training sequences derived from regular and simple nonregular languages. Our novel combination of LSTM and the decoupled extended Kalman filter, however, learns even faster and generalizes even better, requiring only the 10 shortest exemplars (n <= 10) of the context sensitive language a^n b^n c^n to deal correctly with values of n up to 1000 and more. Even when we consider the relatively high update complexity per timestep, in many cases the hybrid offers faster learning than LSTM by itself.
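
To make the data regime in the abstract concrete, the following is a minimal Python sketch of the language a^n b^n c^n: it builds the 10 shortest exemplars (n <= 10) and one long string with n = 1000. The function names `anbncn` and `is_anbncn` are hypothetical and chosen for illustration only; the abstract does not say how sequences are encoded or presented to the network (for example, whether start or end markers are used), so none of that is modeled here.

```python
def anbncn(n: int) -> str:
    """Return the string a^n b^n c^n for n >= 1 (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return "a" * n + "b" * n + "c" * n


def is_anbncn(s: str) -> bool:
    """Check membership in the language {a^n b^n c^n : n >= 1}."""
    n, remainder = divmod(len(s), 3)
    # The length must be a positive multiple of 3, and the string must
    # equal the unique member of the language with that length.
    return n >= 1 and remainder == 0 and s == anbncn(n)


# The 10 shortest exemplars, matching the training range in the abstract.
train = [anbncn(n) for n in range(1, 11)]

# One long string of the size the abstract reports generalizing to.
test_string = anbncn(1000)
```

The membership check works because the language has exactly one string of each valid length, so comparing against the regenerated string is sufficient. A training set like this contains only 10 strings, the longest of length 30, which is what makes the reported generalization to n = 1000 notable.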

BibTeX key: Gers2002
entry type: inproceedings
address: Berlin
booktitle: Artificial Neural Networks -- ICANN 2002 (Proceedings)
year: 2002
pages: 655--660
publisher: Springer
timestamp: 2009.04.18
source: OwnPublication
owner: thierry
