[DECtalk] an article about speech synthesis methods
Text_to_Speech at GMX.com
Mon Sep 13 00:17:12 EDT 2021
On 9/12/2021 7:58 AM, Blake Roberts via Dectalk wrote:
> While browsing the most recent Top Tech Tidbits newsletter, I came across this
> blog post which explains different methods for creating speech synthesis.
> The author claims, at the time this message is written, that parametric
> synthesis is not available for screen readers. I wrote to her already about the
> parametric synthesis solution RHVoice. Some of you might find the article
> explaining various TTS methods of interest. I definitely did!
If the author's stated goal is a "fast speaking voice" that is also
"intelligible", that can be accomplished with a diphone synthesizer
(which gives better intelligibility than a formant-based design)
using a speaker (as in "person who can speak") who has the ability
to speak very quickly as the "model" for the diphone inventory.
[There are folks who can speak at 600 wpm]
You can also raise the clock frequency of the waveform output
to emit samples of the waveform more quickly. This has the unfortunate
side effect of also raising the pitch -- like playing a 33 RPM phonograph
record at 45 RPM.
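
To put rough numbers on the phonograph analogy, here is a minimal Python sketch. The names make_tone and clock_out are made up for illustration, and a plain 100 Hz sine stands in for a voiced sound; it only shows that clocking the same samples out faster shortens the duration and raises the pitch by the same factor.

```python
import math

def make_tone(freq_hz, sample_rate, seconds):
    """Return a sine tone as a list of samples (stand-in for a voiced sound)."""
    count = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(count)]

def clock_out(samples, playback_rate):
    """Return (duration_s, approx_pitch_hz) when samples are emitted at playback_rate."""
    duration = len(samples) / playback_rate
    # One upward zero crossing per cycle, so crossings per second approximates pitch.
    cycles = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return duration, cycles / duration

tone = make_tone(100, 8000, 1.0)          # "recorded" at 8 kHz
normal = clock_out(tone, 8000)            # emitted at the original clock
fast = clock_out(tone, 8000 * 45 / 33)    # the 33 RPM record played at 45 RPM
print("normal:", normal)
print("fast:  ", fast)
```

The zero-crossing count is only a crude pitch estimate, good enough for a pure tone; it is not how one would measure pitch in real speech.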
Alternatively, you can selectively remove redundant portions of the
speech waveform (created from a "normal" speaker) to gain increases
in throughput without the "mickey mouse" effect of arbitrarily
speeding up the timebase. Speech is highly redundant so one can
chop pieces out and not lose information.
[My thesis advisor marketed a device that does this way back in the '70s]
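
A toy version of the "chop pieces out" idea, as a Python sketch under heavy assumptions: the signal is a steady sine with a known, fixed pitch period, and whole periods are dropped so the pieces that remain still line up. The names drop_periods and estimate_pitch are hypothetical. This is not a description of the device mentioned above; real speech would need pitch tracking and some smoothing at the joins.

```python
import math

def drop_periods(samples, period, keep=2, drop=1):
    """Keep `keep` pitch periods, then skip `drop` periods, and repeat."""
    out = []
    group = (keep + drop) * period
    for start in range(0, len(samples), group):
        # Only whole periods are removed, so the waveform stays continuous.
        out.extend(samples[start:start + keep * period])
    return out

def estimate_pitch(samples, sample_rate):
    """Crude pitch estimate: upward zero crossings per second."""
    cycles = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return cycles * sample_rate / len(samples)

sample_rate = 8000
period = 80                                # 100 Hz at 8 kHz (assumed, fixed)
voiced = [math.sin(2 * math.pi * i / period) for i in range(sample_rate)]
shorter = drop_periods(voiced, period)     # about two thirds of the original length

print("samples:", len(voiced), "->", len(shorter))
print("pitch:  ", estimate_pitch(voiced, sample_rate),
      "->", estimate_pitch(shorter, sample_rate))
```

Played back at the same 8 kHz clock, the shortened signal takes about a third less time while the estimated pitch stays essentially where it was, which is the point of removing material instead of speeding up the timebase.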
These aren't things that an end user is likely going to be able to do,