Techreport

Comparing LSTM Recurrent Networks and Spiking Recurrent Networks on the Recognition of Spoken Digits

, , and .
IDSIA-13-03. IDSIA, www.idsia.ch/techrep.html, May 2003.

Abstract

One advantage of spiking recurrent neural networks (SNNs) is an ability to categorise data using a synchrony-based latching mechanism. This is particularly useful in problems where timewarping is encountered, such as speech recognition. Differentiable recurrent neural networks (RNNs) by contrast fail at tasks involving difficult timewarping, despite having sequence learning capabilities superior to SNNs. In this paper we demonstrate that Long Short-Term Memory (LSTM) is an RNN capable of robustly categorising timewarped speech data, thus combining the most useful features of both paradigms. We compare its performance to SNNs on two variants of a spoken digit identification task, using data from an international competition. The first task (described in Nature (Nadis 2003)) required the categorisation of spoken digits with only a single training exemplar, and was specifically designed to test robustness to timewarping. Here LSTM performed better than all the SNNs in the competition. The second task was to predict spoken digits using a larger training set. Here LSTM greatly outperformed an SNN-like model found in the literature. These results suggest that LSTM has a place in domains that require the learning of large timewarped datasets, such as automatic speech recognition.
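The abstract's central claim is that a gated RNN can latch onto evidence for a digit class regardless of how the utterance is locally stretched or compressed. As a purely illustrative sketch (not the authors' implementation; the numpy formulation, layer sizes, and all parameter names are assumptions), the following shows a single-layer LSTM forward pass classifying a sequence of acoustic feature frames into one of ten digit classes:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_classify(frames, params):
    """frames: (T, n_in) array of feature frames; returns digit class probabilities."""
    W, U, b, W_out, b_out = params       # input weights, recurrent weights, biases, readout
    n_hid = U.shape[1]
    h = np.zeros(n_hid)                  # hidden state
    c = np.zeros(n_hid)                  # cell state: the memory that bridges timewarped gaps
    for x in frames:
        z = W @ x + U @ h + b            # pre-activations for all four gates at once
        i, f, o, g = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)   # input, forget, output gates
        c = f * c + i * np.tanh(g)       # gated cell update
        h = o * np.tanh(c)
    logits = W_out @ h + b_out           # classify from the final hidden state
    e = np.exp(logits - logits.max())
    return e / e.sum()                   # softmax over the 10 digits

# Example with made-up sizes: 12 MFCC-like features, 20 hidden units, 50 frames.
rng = np.random.default_rng(0)
n_in, n_hid, n_cls, T = 12, 20, 10, 50
params = (rng.standard_normal((4 * n_hid, n_in)) * 0.1,
          rng.standard_normal((4 * n_hid, n_hid)) * 0.1,
          np.zeros(4 * n_hid),
          rng.standard_normal((n_cls, n_hid)) * 0.1,
          np.zeros(n_cls))
print(lstm_classify(rng.standard_normal((T, n_in)), params))

The forget-gated cell state c is the mechanism relevant to the paper: it can hold evidence across stretched or compressed segments of the input, which is the timewarping robustness the tasks above were designed to probe.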
