Mixed Excitation for HMM-based Speech Synthesis

Proceedings of the 7th European Conference on Speech Communication and Technology (EUROSPEECH/INTERSPEECH), pages 2263-2266. Aalborg, Denmark (September 2001)

Abstract

This paper describes improvements on the excitation model of an HMM-based text-to-speech system. In our previous work, natural spectral and pitch parameters have been generated from HMM by using a speech parameter generation algorithm. However, synthesized speech has a typical quality of "vocoded speech" since the system used a traditional excitation model with either a periodic impulse train or white noise. In this paper, in order to reduce the synthetic quality, a mixed excitation model used in MELP is incorporated into the system. Excitation parameters used in mixed excitation are modeled by HMMs, and generated from HMMs by a parameter generation algorithm in the synthesis phase. The result of a listening test shows that the mixed excitation model significantly improves quality of synthesized speech as compared with the traditional excitation model.
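
The contrast the abstract draws between the two excitation models can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the single full-band `voicing_strength` weight, and the fixed noise seed are assumptions made for illustration. MELP-style mixed excitation applies voicing strengths per frequency band through bandpass filters, which this sketch omits.

```python
import random


def pulse_train(n_samples, period):
    """Periodic impulse train: a unit impulse every `period` samples."""
    if period <= 0:
        raise ValueError("period must be a positive number of samples")
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]


def white_noise(n_samples, seed=0):
    """Gaussian white noise; the fixed seed (an assumption) keeps runs repeatable."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def traditional_excitation(n_samples, period, voiced):
    """Binary switch: impulse train for voiced frames, white noise otherwise."""
    if voiced:
        return pulse_train(n_samples, period)
    return white_noise(n_samples)


def mixed_excitation(n_samples, period, voicing_strength):
    """Simplified full-band mix of pulses and noise.

    `voicing_strength` in [0, 1] is a hypothetical single weight; a real
    MELP-style model would use one strength per frequency band.
    """
    if not 0.0 <= voicing_strength <= 1.0:
        raise ValueError("voicing_strength must lie in [0, 1]")
    pulses = pulse_train(n_samples, period)
    noise = white_noise(n_samples)
    return [
        voicing_strength * p + (1.0 - voicing_strength) * n
        for p, n in zip(pulses, noise)
    ]
```

With `voicing_strength` set to 1.0 or 0.0 the mix reduces to the traditional pulse-only or noise-only excitation; intermediate values blend the two sources, which is the idea behind reducing the "vocoded" quality.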
