Machine Learning Paradigms for Speech Recognition: An Overview

L. Deng, and X. Li. Audio, Speech, and Language Processing, IEEE Transactions on, 21 (5): 1060--1089 (May 2013)
DOI: 10.1109/TASL.2013.2244083

Abstract

Automatic Speech Recognition (ASR) has historically been a driving force behind many machine learning (ML) techniques, including the ubiquitously used hidden Markov model, discriminative learning, structured sequence learning, Bayesian learning, and adaptive learning. Moreover, ML can and occasionally does use ASR as a large-scale, realistic application to rigorously test the effectiveness of a given technique, and to inspire new problems arising from the inherently sequential and dynamic nature of speech. On the other hand, even though ASR is available commercially for some applications, it is largely an unsolved problem - for almost all applications, the performance of ASR is not on par with human performance. New insight from modern ML methodology shows great promise to advance the state-of-the-art in ASR technology. This overview article provides readers with an overview of modern ML techniques as utilized in the current and as relevant to future ASR research and systems. The intent is to foster further cross-pollination between the ML and ASR communities than has occurred in the past. The article is organized according to the major ML paradigms that are either popular already or have potential for making significant contributions to ASR technology. The paradigms presented and elaborated in this overview include: generative and discriminative learning; supervised, unsupervised, semi-supervised, and active learning; adaptive and multi-task learning; and Bayesian learning. These learning paradigms are motivated and discussed in the context of ASR technology and applications. We finally present and analyze recent developments of deep learning and learning with sparse representations, focusing on their direct relevance to advancing ASR technology.

Links and resources

BibTeX key: deng-paradigms-speech-recognition-2013
entry type: article
year: 2013
month: may
journal: Audio, Speech, and Language Processing, IEEE Transactions on
number: 5
pages: 1060--1089
volume: 21
issn: 1558-7916
DOI: 10.1109/TASL.2013.2244083
url: http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6423821

@mhwombat's tags highlighted

speech_recognition

Cite this publication

%0 Journal Article
%1 deng-paradigms-speech-recognition-2013
%A Deng, Li
%A Li, Xiao
%D 2013
%J Audio, Speech, and Language Processing, IEEE Transactions on
%K speech_recognition
%N 5
%P 1060--1089
%R 10.1109/TASL.2013.2244083
%T Machine Learning Paradigms for Speech Recognition: An Overview
%U http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6423821
%V 21
%X Automatic Speech Recognition (ASR) has historically been a driving force behind many machine learning (ML) techniques, including the ubiquitously used hidden Markov model, discriminative learning, structured sequence learning, Bayesian learning, and adaptive learning. Moreover, ML can and occasionally does use ASR as a large-scale, realistic application to rigorously test the effectiveness of a given technique, and to inspire new problems arising from the inherently sequential and dynamic nature of speech. On the other hand, even though ASR is available commercially for some applications, it is largely an unsolved problem - for almost all applications, the performance of ASR is not on par with human performance. New insight from modern ML methodology shows great promise to advance the state-of-the-art in ASR technology. This overview article provides readers with an overview of modern ML techniques as utilized in the current and as relevant to future ASR research and systems. The intent is to foster further cross-pollination between the ML and ASR communities than has occurred in the past. The article is organized according to the major ML paradigms that are either popular already or have potential for making significant contributions to ASR technology. The paradigms presented and elaborated in this overview include: generative and discriminative learning; supervised, unsupervised, semi-supervised, and active learning; adaptive and multi-task learning; and Bayesian learning. These learning paradigms are motivated and discussed in the context of ASR technology and applications. We finally present and analyze recent developments of deep learning and learning with sparse representations, focusing on their direct relevance to advancing ASR technology.

@article{deng-paradigms-speech-recognition-2013,
  abstract = {Automatic Speech Recognition (ASR) has historically been a driving force behind many machine learning (ML) techniques, including the ubiquitously used hidden Markov model, discriminative learning, structured sequence learning, Bayesian learning, and adaptive learning. Moreover, ML can and occasionally does use ASR as a large-scale, realistic application to rigorously test the effectiveness of a given technique, and to inspire new problems arising from the inherently sequential and dynamic nature of speech. On the other hand, even though ASR is available commercially for some applications, it is largely an unsolved problem - for almost all applications, the performance of ASR is not on par with human performance. New insight from modern ML methodology shows great promise to advance the state-of-the-art in ASR technology. This overview article provides readers with an overview of modern ML techniques as utilized in the current and as relevant to future ASR research and systems. The intent is to foster further cross-pollination between the ML and ASR communities than has occurred in the past. The article is organized according to the major ML paradigms that are either popular already or have potential for making significant contributions to ASR technology. The paradigms presented and elaborated in this overview include: generative and discriminative learning; supervised, unsupervised, semi-supervised, and active learning; adaptive and multi-task learning; and Bayesian learning. These learning paradigms are motivated and discussed in the context of ASR technology and applications. We finally present and analyze recent developments of deep learning and learning with sparse representations, focusing on their direct relevance to advancing ASR technology.},
  added-at = {2015-11-17T13:15:54.000+0100},
  author = {Deng, Li and Li, Xiao},
  biburl = {https://www.bibsonomy.org/bibtex/2ea0300419777ceee425294cf01c8e84f/mhwombat},
  doi = {10.1109/TASL.2013.2244083},
  interhash = {f536d9bba87869e362dffdf8e0a913ac},
  intrahash = {ea0300419777ceee425294cf01c8e84f},
  issn = {1558-7916},
  journal = {Audio, Speech, and Language Processing, IEEE Transactions on},
  keywords = {speech_recognition},
  month = may,
  number = 5,
  pages = {1060--1089},
  timestamp = {2016-07-12T19:25:30.000+0200},
  title = {Machine Learning Paradigms for Speech Recognition: An Overview},
  url = {http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6423821},
  volume = 21,
  year = 2013
}
