Abstract
This paper presents an algorithm for continuous speech recognition
built from two Long Short-Term Memory (LSTM) recurrent neural networks.
The first LSTM network performs frame-level phone probability estimation;
the second maps these phone predictions onto words. In contrast
to HMMs, this architecture can exploit long-timescale correlations in the speech signal.
Simulation results are presented for a hand-segmented subset of the
"Numbers-95" database. These results include isolated phone prediction,
continuous frame-level phone prediction and continuous word prediction.
We conclude that despite its early stage of development, our new
model is already competitive with existing approaches on certain
aspects of speech recognition and promising on others, warranting
further research.
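The first stage described above, frame-level phone probability estimation, amounts to running an LSTM over acoustic feature frames and applying a softmax over phone classes at each step. The following is an illustrative NumPy sketch of that idea, not the paper's implementation; all function names, weight shapes, and dimensions are assumptions made for the example.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step; gates are computed jointly from [x; h_prev].

    W has shape (4*H, D+H): rows hold input, forget, and output gates
    plus the cell candidate, stacked in that order.
    """
    z = W @ np.concatenate([x, h_prev]) + b
    H = h_prev.size
    i = sigmoid(z[0:H])        # input gate
    f = sigmoid(z[H:2*H])      # forget gate
    o = sigmoid(z[2*H:3*H])    # output gate
    g = np.tanh(z[3*H:4*H])    # cell candidate
    c = f * c_prev + i * g     # new cell state
    h = o * np.tanh(c)         # new hidden state
    return h, c

def frame_phone_probs(frames, W, b, W_out, b_out, hidden):
    """Run the LSTM over acoustic frames and emit a phone
    probability distribution for every frame."""
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    probs = []
    for x in frames:
        h, c = lstm_step(x, h, c, W, b)
        probs.append(softmax(W_out @ h + b_out))
    return np.array(probs)  # shape: (num_frames, num_phones)
```

A second network of the same form could then consume these per-frame phone distributions as its input sequence and emit word hypotheses, mirroring the two-stage arrangement the abstract describes.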