SottoVoce: An Ultrasound Imaging-Based Silent Speech Interaction Using Deep Neural Networks

N. Kimura, M. Kono, and J. Rekimoto.
Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Paper 146, pages 1--11. New York, NY, USA, Association for Computing Machinery (May 2019)

Abstract

The availability of digital devices operated by voice is expanding rapidly. However, the applications of voice interfaces are still restricted. For example, speaking in public places becomes an annoyance to the surrounding people, and secret information should not be uttered. Environmental noise may reduce the accuracy of speech recognition. To address these limitations, a system to detect a user's unvoiced utterance is proposed. From internal information observed by an ultrasonic imaging sensor attached to the underside of the jaw, our proposed system recognizes the utterance contents without the user's uttering voice. Our proposed deep neural network model is used to obtain acoustic features from a sequence of ultrasound images. We confirmed that audio signals generated by our system can control the existing smart speakers. We also observed that a user can adjust their oral movement to learn and improve the accuracy of their voice recognition.

BibTeX key: Kimura2019-tx
entry type: inproceedings
address: New York, NY, USA
booktitle: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems
year: 2019
month: may
number: Paper 146
pages: 1--11
publisher: Association for Computing Machinery
series: CHI '19
location: Glasgow, Scotland, UK
