A Comparison Between Spiking and Differentiable Recurrent Neural Networks on Spoken Digit Recognition

Abstract

In this paper we demonstrate that Long Short-Term Memory (LSTM) is a differentiable recurrent neural net (RNN) capable of robustly categorizing timewarped speech data. We measure its performance on a spoken digit identification task, where the data was spike-encoded in such a way that classifying the utterances became a difficult challenge in non-linear timewarping. We find that LSTM gives greatly superior results to a spiking neural network (SNN) found in the literature, and conclude that the architecture has a place in domains that require the learning of large timewarped datasets, such as automatic speech recognition.

BibTeX key: graves+beringer+schmidhuber:2003
entry type: inproceedings
address: Grindelwald
booktitle: The 23rd IASTED International Conference on Modelling, Identification, and Control
year: 2003
priority: 2
citeulike-article-id: 2381716
