Filter Bank Energy Based Malayalam Speech Segmentation and Recognition
P. K.P and S. Idiculla. International Journal of Recent Trends in Human Computer Interaction (IJHCI) 8: 1-7 (February 2017)
Even though speech recognition technologies have made substantial progress, LVSR and vocabulary-independent systems have not yet attained sufficient accuracy. Vocabulary-independent speech recognition requires segmenting the speech signal into its constituent units, such as phonemes or syllables. This paper presents a method for segmenting spoken Malayalam words into their constituent syllables and analyses the classification accuracy obtained with PNN and HMM. Variations in peak filter bank energy serve as the criterion for segmentation. In the feature extraction stage, Mel Frequency Cepstral Coefficients (MFCC) and the energy in each frame form the feature vector. A semi-automatic method is used to label the speech segments in the training phase. The system is trained using 30 samples of 26 syllables, semi-automatically segmented from fifty words collected from a male and a female speaker. Tested on another set of fifty words containing 4720 syllables, it gives a maximum accuracy of 74.7% for the male speaker and 66.77% for the female speaker.
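The segmentation criterion named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "peak filter bank energy" means the largest mel filter output in each frame, and that syllable boundaries fall at low-energy local minima of that contour. The frame length, hop, filter count, smoothing width, and threshold are all hypothetical values chosen for illustration.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_bank(n_filters, n_fft, sample_rate):
    """Triangular filters evenly spaced on the mel scale, shape (n_filters, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            bank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            bank[i - 1, k] = (right - k) / (right - center)
    return bank

def peak_filter_bank_energy(signal, sample_rate, frame_len=400, hop=160,
                            n_fft=512, n_filters=20):
    """Largest mel filter bank energy in each frame (assumed reading of the criterion)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hamming(frame_len)
    bank = mel_filter_bank(n_filters, n_fft, sample_rate)
    peaks = np.empty(n_frames)
    for t in range(n_frames):
        frame = signal[t * hop : t * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame, n_fft)) ** 2 / n_fft
        peaks[t] = np.max(bank @ power)
    return peaks

def candidate_boundaries(peaks, smooth=5, rel_threshold=0.1):
    """Frames where the smoothed contour reaches a local minimum below a relative threshold."""
    contour = np.convolve(peaks, np.ones(smooth) / smooth, mode="same")
    limit = rel_threshold * contour.max()
    return [t for t in range(1, len(contour) - 1)
            if contour[t] < contour[t - 1]
            and contour[t] <= contour[t + 1]
            and contour[t] < limit]

# Synthetic stand-in for a two-syllable word: tone, silence, tone.
sr = 16000
tone = np.sin(2 * np.pi * 440.0 * np.arange(int(0.3 * sr)) / sr)
word = np.concatenate([tone, np.zeros(int(0.2 * sr)), tone])
print(candidate_boundaries(peak_filter_bank_energy(word, sr)))
```

On real speech the contour rarely drops to silence between syllables, so the threshold and smoothing would need tuning, and the semi-automatic labeling mentioned in the abstract suggests that candidate boundaries were reviewed by hand.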