Abstract
The term ``Silent Speech Interface'' was introduced almost a
decade ago to describe speech communication systems using only
non-acoustic sensors, such as electromyography, ultrasound tongue
imaging, or electromagnetic articulography. Although the use of
specialized sensors in speech processing is challenging, silent
speech research remains an active field that can often profit
from new developments in traditional acoustic speech processing
-- for example, recent advances in Deep Learning. After an
overview of Silent Speech Interfaces and their special
challenges, the article presents new results in which a 2010
benchmark study, called the Silent Speech Challenge, is updated
with a Deep Learning strategy, using the same input features and
decoding strategy as in the original Challenge article. A Word
Error Rate of 6.4\% is obtained with the new method, compared to
the published benchmark value of 17.4\%. Additional results
comparing new auto-encoder-based features with the original
features at reduced dimensionality, as well as decoding scenarios
using two different language models, are also presented. The Silent
Speech Challenge archive has furthermore been updated to contain
both the original and the new auto-encoder features, in addition
to the original raw data.
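
To make the auto-encoder feature idea concrete, the following is a minimal sketch, assuming PyTorch as the framework and hypothetical dimensions (a 736-dimensional input frame compressed to a 30-dimensional bottleneck); the class name FeatureAutoencoder, the layer sizes, and the training setup are illustrative assumptions and do not reproduce the implementation described in the paper.

    # Illustrative sketch only: a bottleneck autoencoder that compresses
    # per-frame sensor-derived features to a lower dimensionality. All
    # dimensions and hyperparameters below are assumptions, not the paper's.
    import torch
    import torch.nn as nn

    class FeatureAutoencoder(nn.Module):
        def __init__(self, in_dim=736, code_dim=30):
            super().__init__()
            # Encoder: reduce the input frame to the low-dimensional code.
            self.encoder = nn.Sequential(
                nn.Linear(in_dim, 256), nn.ReLU(),
                nn.Linear(256, code_dim),
            )
            # Decoder: mirror of the encoder, reconstructs the input frame.
            self.decoder = nn.Sequential(
                nn.Linear(code_dim, 256), nn.ReLU(),
                nn.Linear(256, in_dim),
            )

        def forward(self, x):
            code = self.encoder(x)
            return self.decoder(code), code

    model = FeatureAutoencoder()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    # Random stand-in data; in practice these would be sensor feature frames.
    frames = torch.randn(1024, 736)

    for epoch in range(10):
        reconstruction, _ = model(frames)
        loss = loss_fn(reconstruction, frames)  # reconstruction error
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # After training, the bottleneck activations serve as the
    # reduced-dimensionality features passed on to the recognizer.
    with torch.no_grad():
        _, reduced_features = model(frames)  # shape: (1024, 30)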