Abstract
Unlike traditional recurrent neural networks, the Long Short-Term
Memory (LSTM) model generalizes well when presented with training
sequences derived from regular and also simple nonregular languages.
Our novel combination of LSTM and the decoupled extended Kalman filter,
however, learns even faster and generalizes even better, requiring
only the 10 shortest exemplars (n <= 10) of the context-sensitive language
a^n b^n c^n to deal correctly with values of n up to 1000 and more.
Even when we consider the relatively high update complexity per timestep,
in many cases the hybrid offers faster learning than LSTM by itself.
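
To make the training and generalization setup concrete, the following minimal sketch builds the 10 shortest strings of a^n b^n c^n and a much longer test string. The function names `anbncn` and `is_anbncn` are hypothetical, and starting at n = 1 is an assumption. The sketch shows only the data. It does not implement LSTM or the decoupled extended Kalman filter.

```python
def anbncn(n: int) -> str:
    """Return the string a^n b^n c^n, e.g. n=2 gives 'aabbcc'."""
    return "a" * n + "b" * n + "c" * n


def is_anbncn(s: str) -> bool:
    """Check membership in {a^n b^n c^n : n >= 1} (n >= 1 is an assumption)."""
    n, remainder = divmod(len(s), 3)
    return remainder == 0 and n >= 1 and s == anbncn(n)


# Training set: the 10 shortest exemplars, n = 1..10.
train = [anbncn(n) for n in range(1, 11)]

# A generalization probe far outside the training range.
probe = anbncn(1000)
```

The longest training string has 30 symbols, while the probe for n = 1000 has 3000, which illustrates how far beyond the training data the reported generalization reaches.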