Environment and Sensor Robustness in Automatic Speech Recognition
U. Bhattacharjee. International Journal of Innovative Science and Modern Engineering (IJISME), 1 (2): 31-37 (January 2013)
Abstract
Most presently available speech recognition systems work efficiently only under ideal conditions, because they are built on assumptions about their operating conditions. A system works efficiently if the actual working environment is identical to the environment for which it was built; its performance degrades considerably when the training and testing environments are mismatched. The present study considers mismatch due to sensor variability and environment, and investigates Cepstral Mean Normalization (CMN) and spectral subtraction as front-end methods for noise reduction. A Hidden Markov Model (HMM) based speech recognition system was built with Mel-Frequency Cepstral Coefficients (MFCCs) as the feature vector. Applying CMN and spectral subtraction for noise reduction improved system performance by 15% over the baseline under channel- and environment-mismatched conditions.
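The CMN step named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `cepstral_mean_normalize` and the toy frame values are assumptions made for illustration. It relies only on the standard observation that a fixed channel (for example, a different microphone) appears as a roughly constant additive offset in the cepstral domain, so subtracting the per-coefficient mean over an utterance removes it.

```python
from typing import List


def cepstral_mean_normalize(frames: List[List[float]]) -> List[List[float]]:
    """Subtract the per-coefficient mean taken over all frames of one utterance."""
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    # Mean of each cepstral coefficient across the utterance.
    means = [sum(frame[k] for frame in frames) / n_frames for k in range(n_coeffs)]
    return [[frame[k] - means[k] for k in range(n_coeffs)] for frame in frames]


# Toy "cepstral" frames (hypothetical values: 3 frames x 2 coefficients).
clean = [[1.0, -2.0], [3.0, 0.0], [2.0, 2.0]]
# A fixed channel is modeled here as a constant offset added to every frame.
offset = [0.5, -1.5]
shifted = [[c + o for c, o in zip(frame, offset)] for frame in clean]

normalized_clean = cepstral_mean_normalize(clean)
normalized_shifted = cepstral_mean_normalize(shifted)
```

In this toy case the two normalized sequences agree up to floating-point error, which is the sense in which CMN reduces channel mismatch between training and testing data.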
%0 Journal Article
%1 Foltz.2002
%A Bhattacharjee, Utpal
%D 2013
%E Kumar, Shiv
%J International Journal of Innovative Science and Modern Engineering (IJISME)
%K CMN MFCC Recognition Robust Spectral Speech Subtraction
%N 2
%P 31-37
%T Environment and Sensor Robustness in Automatic Speech Recognition
%U https://www.ijisme.org/wp-content/uploads/papers/v1i2/B0128011213.pdf
%V 1
%X Most presently available speech recognition systems work efficiently only under ideal conditions, because they are built on assumptions about their operating conditions. A system works efficiently if the actual working environment is identical to the environment for which it was built; its performance degrades considerably when the training and testing environments are mismatched. The present study considers mismatch due to sensor variability and environment, and investigates Cepstral Mean Normalization (CMN) and spectral subtraction as front-end methods for noise reduction. A Hidden Markov Model (HMM) based speech recognition system was built with Mel-Frequency Cepstral Coefficients (MFCCs) as the feature vector. Applying CMN and spectral subtraction for noise reduction improved system performance by 15% over the baseline under channel- and environment-mismatched conditions.
@article{Foltz.2002,
abstract = {Most presently available speech recognition systems work efficiently only under ideal conditions, because they are built on assumptions about their operating conditions. A system works efficiently if the actual working environment is identical to the environment for which it was built; its performance degrades considerably when the training and testing environments are mismatched. The present study considers mismatch due to sensor variability and environment, and investigates Cepstral Mean Normalization (CMN) and spectral subtraction as front-end methods for noise reduction. A Hidden Markov Model (HMM) based speech recognition system was built with Mel-Frequency Cepstral Coefficients (MFCCs) as the feature vector. Applying CMN and spectral subtraction for noise reduction improved system performance by 15\% over the baseline under channel- and environment-mismatched conditions.},
added-at = {2021-09-22T10:53:58.000+0200},
author = {Bhattacharjee, Utpal},
biburl = {https://www.bibsonomy.org/bibtex/2b1e5720593b2368e47996a324b21b1c3/ijisme_beiesp},
editor = {Kumar, Shiv},
interhash = {7ef41a142ae5ed321577b56067948ff8},
intrahash = {b1e5720593b2368e47996a324b21b1c3},
issn = {2319-6386},
journal = {International Journal of Innovative Science and Modern Engineering (IJISME)},
keywords = {CMN MFCC Recognition Robust Spectral Speech Subtraction},
language = {En},
month = jan,
number = 2,
pages = {31--37},
timestamp = {2021-09-22T10:53:58.000+0200},
title = {Environment and Sensor Robustness in Automatic Speech Recognition},
url = {https://www.ijisme.org/wp-content/uploads/papers/v1i2/B0128011213.pdf},
volume = 1,
year = 2013
}