Article,

Speaker interpolation for HMM-based speech synthesis system

T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura.
Journal of the Acoustical Society of Japan (E), 21 (4): 199-206 (July 2000)
DOI: http://dx.doi.org/10.1250/ast.21.199

Abstract

This paper describes an approach to voice characteristics conversion for an HMM-based text-to-speech synthesis system using speaker interpolation. Although most text-to-speech synthesis systems that synthesize speech by concatenating speech units can produce speech of acceptable quality, they still cannot synthesize speech with varied voice qualities such as speaker individualities and emotions. To control speaker individualities and emotions, they therefore need a large database that records speech units with various voice characteristics in the synthesis phase. In contrast, our system synthesizes speech with an untrained speaker’s voice quality by interpolating HMM parameters among some representative speakers’ HMM sets. Accordingly, our system can synthesize speech with various voice qualities without a large database in the synthesis phase. An HMM interpolation technique is derived from a probabilistic similarity measure for HMMs and used to synthesize speech with an untrained speaker’s voice quality by interpolating HMM parameters among some representative speakers’ HMM sets. The results of subjective experiments show that the voice quality of synthesized speech can be changed gradually from one speaker’s to the other’s by changing the interpolation ratio.

BibTeX key: Yoshimura2000
entry type: article
year: 2000
month: jul
journal: Journal of the Acoustical Society of Japan (E)
number: 4
pages: 199-206
volume: 21
owner: schabus
file: :pdfs/yoshimura_acoustical_2000.pdf:PDF
DOI: http://dx.doi.org/10.1250/ast.21.199

Cite this publication

@article{Yoshimura2000,
  abstract = {This paper describes an approach to voice characteristics conversion for an HMM-based text-to-speech synthesis system using speaker interpolation. Although most text-to-speech synthesis systems that synthesize speech by concatenating speech units can produce speech of acceptable quality, they still cannot synthesize speech with varied voice qualities such as speaker individualities and emotions. To control speaker individualities and emotions, they therefore need a large database that records speech units with various voice characteristics in the synthesis phase. In contrast, our system synthesizes speech with an untrained speaker’s voice quality by interpolating HMM parameters among some representative speakers’ HMM sets. Accordingly, our system can synthesize speech with various voice qualities without a large database in the synthesis phase. An HMM interpolation technique is derived from a probabilistic similarity measure for HMMs and used to synthesize speech with an untrained speaker’s voice quality by interpolating HMM parameters among some representative speakers’ HMM sets. The results of subjective experiments show that the voice quality of synthesized speech can be changed gradually from one speaker’s to the other’s by changing the interpolation ratio.},
  added-at = {2021-02-01T10:51:23.000+0100},
  author = {Yoshimura, Takayoshi and Tokuda, Keiichi and Masuko, Takashi and Kobayashi, Takao and Kitamura, Tadashi},
  biburl = {https://www.bibsonomy.org/bibtex/2d034d76548aec95cfb175dae0acc3ae5/m-toman},
  doi = {http://dx.doi.org/10.1250/ast.21.199},
  file = {:pdfs/yoshimura_acoustical_2000.pdf:PDF},
  interhash = {cf1e4b667a4047d585f6d6406ade62f0},
  intrahash = {d034d76548aec95cfb175dae0acc3ae5},
  journal = {Journal of the Acoustical Society of Japan (E)},
  keywords = {imported},
  month = jul,
  number = 4,
  owner = {schabus},
  pages = {199--206},
  timestamp = {2021-02-01T10:51:23.000+0100},
  title = {Speaker interpolation for {HMM}-based speech synthesis system},
  volume = 21,
  year = 2000
}
