Automatic generation of synthesis units for trainable text-to-speech systems

H. Hon, A. Acero, X. Huang, J. Liu, and M. Plumpe. Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), vol. 1, pp. 293-296. Seattle, WA, USA (May 1998).
DOI: 10.1109/ICASSP.1998.674425

Abstract

The Whistler text-to-speech engine was designed so that we can automatically construct the model parameters from training data. This paper describes in detail the design issues of constructing the synthesis unit inventory automatically from speech databases. The automatic process includes (1) determining the scaleable synthesis unit which can reflect spectral variations of different allophones; (2) segmenting the recorded sentences into phonetic segments; (3) selecting good instances for each synthesis unit to generate the best synthesis sentence at run time. These processes are all derived through the use of probabilistic learning methods which are aimed at the same optimization criteria. Through this automatic unit generation, Whistler can automatically produce synthetic speech that sounds very natural and resembles the acoustic characteristics of the original speaker.

Links and resources

BibTeX key: Hon1998
entry type: inproceedings
address: Seattle, WA, USA
booktitle: Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
year: 1998
month: may
pages: 293-296
volume: 1
owner: schabus
file: :pdfs/hon_icassp_1998.pdf:PDF
issn: 1520-6149
DOI: 10.1109/ICASSP.1998.674425

Cite this publication

@inproceedings{Hon1998,
  abstract = {The Whistler text-to-speech engine was designed so that we can automatically construct the model parameters from training data. This paper describes in detail the design issues of constructing the synthesis unit inventory automatically from speech databases. The automatic process includes (1) determining the scaleable synthesis unit which can reflect spectral variations of different allophones; (2) segmenting the recording sentences into phonetic segments; (3) select good instances for each synthesis unit to generate best synthesis sentence during the run time. These processes are all derived through the use of probabilistic learning methods which are aimed at the same optimization criteria. Through this automatic unit generation, Whistler can automatically produce synthetic speech that sounds very natural and resembles the acoustic characteristics of the original speaker},
  added-at = {2021-02-01T10:51:23.000+0100},
  address = {Seattle, WA, USA},
  author = {Hon, Hsiao-Wuen and Acero, Alex and Huang, Xuedong and Liu, Jingsong and Plumpe, Mike},
  biburl = {https://www.bibsonomy.org/bibtex/2fb12f62117cd5f3142586e163ee877f3/m-toman},
  booktitle = {Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  doi = {10.1109/ICASSP.1998.674425},
  file = {:pdfs/hon_icassp_1998.pdf:PDF},
  interhash = {952bd33c78fd912ed6c3f14e282e3776},
  intrahash = {fb12f62117cd5f3142586e163ee877f3},
  issn = {1520-6149},
  keywords = {(artificial characteristics;allophones;automatic criteria;phonetic data databases;synthesis engine;acoustic generation;Databases;Engines;Humans;Learning generation;automatic generation;design;optimization intelligence);optimisation;speech learning learning;recording methods;Speech segments;probabilistic sentences;scaleable speech;trainable synthesis synthesis;Synthesizers;Training synthesis;Whistler systems;Character systems;Loudspeakers;Optimization text-to-speech unit unit;segmentation;spectral units;synthetic variations;speech},
  month = may,
  owner = {schabus},
  pages = {293-296},
  timestamp = {2021-02-01T10:51:23.000+0100},
  title = {Automatic generation of synthesis units for trainable text-to-speech systems},
  volume = 1,
  year = 1998
}
