Restructuring speech representations using a pitch-adaptive time–frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds

Speech Communication, 27 (3–4): 187–207 (1999)
DOI: 10.1016/S0167-6393(98)00085-5

Abstract

A set of simple new procedures has been developed to enable the real-time manipulation of speech parameters. The proposed method uses pitch-adaptive spectral analysis combined with a surface reconstruction method in the time–frequency region. The method also consists of a fundamental frequency (F0) extraction using instantaneous frequency calculation based on a new concept called 'fundamentalness'. The proposed procedures preserve the details of time–frequency surfaces while almost perfectly removing fine structures due to signal periodicity. This close-to-perfect elimination of interferences and smooth F0 trajectory allow for over 600% manipulation of such speech parameters as pitch, vocal tract length, and speaking rate, while maintaining high reproductive quality.
