[DECtalk] DECtalk's development, and old speech synthesizer recordings

Don Text_to_Speech at GMX.com
Fri Nov 4 07:25:45 EDT 2022


> 2: What did Dennis Klatt's own voice sound like, and how much was he
> involved with DEC's development of DECtalk, after they licensed his work?

For a description (in DK's own words) of the transition process, see:
"How Klattalk became DECtalk: An Academic's Experiences in the Business World"

> I haven't been able to find any recordings of Klatt's original voice on the
> internet, so I think this is really interesting. Personally, I find it most
> likely that his voice sounded more like the recordings in his collection,
> because they, to me, sound more human and natural than DECtalk 1.8 and 2.0.

Saying that a particular DECtalk voice was DK's is misleading.
If you read the literature, you will see reference to various
"speakers" whose voices were analyzed to determine the characteristics
of "speech".  The initials "DK" figure prominently in these lists
of subjects.

This makes sense.  Everyone -- even the author of a research paper -- has
a voice.  If doing research and needing sample voices to analyze, what
more convenient "test subject" than the author himself?

So, one can say that the *parameters* of DK's voice factored heavily
into the default parameters used for DECtalk voices, which thus bore a
resemblance to his speaking STYLE.  Whether that was intentional, on
his part, or just an obvious consequence of his choice of test subjects
is unknown.
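
To make "parameters, not recordings" concrete, here is a minimal sketch
of a formant-style vowel: an impulse train (the "pitch") run through a
cascade of two-pole digital resonators.  The resonator recurrence is the
standard textbook form; the SPEAKER table, the function names and every
number in it are hypothetical values chosen for illustration, NOT
DECtalk's (or DK's) actual settings.

```python
import math

def resonator_coeffs(freq_hz, bw_hz, sample_rate):
    """Coefficients of a two-pole digital resonator for one formant."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c  # normalizes the gain to 1 at 0 Hz
    return a, b, c

def resonate(samples, freq_hz, bw_hz, sample_rate):
    """Apply y[n] = a*x[n] + b*y[n-1] + c*y[n-2] to a list of samples."""
    a, b, c = resonator_coeffs(freq_hz, bw_hz, sample_rate)
    y1 = y2 = 0.0
    out = []
    for x in samples:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def impulse_train(f0_hz, duration_s, sample_rate):
    """Crude voicing source: one unit impulse per pitch period."""
    period = int(round(sample_rate / f0_hz))
    n = int(round(duration_s * sample_rate))
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]

# Hypothetical "speaker": nothing but a table of numbers (assumed values).
SPEAKER = {
    "f0_hz": 110.0,
    "formants": [(700.0, 60.0), (1200.0, 90.0), (2600.0, 150.0)],  # (Hz, bandwidth)
}

def synthesize_vowel(speaker, duration_s=0.1, sample_rate=10000):
    """Pass the voicing source through each formant resonator in turn."""
    signal = impulse_train(speaker["f0_hz"], duration_s, sample_rate)
    for freq_hz, bw_hz in speaker["formants"]:
        signal = resonate(signal, freq_hz, bw_hz, sample_rate)
    return signal
```

Change the numbers in the table and you get a different "voice"; at no
point is anyone's recorded audio involved.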

OTOH, a voice created from diphone synthesis *is* the voice of the subject
who recorded the speech samples from which the diphones were extracted (as
"audio recordings").  So, had he opted to implement the synthesis
portion with diphones (leaving all of the rest of the synthesizer as is),
then it truly *would* be DK's voice that you were hearing.

> I believe I read somewhere, that the Doctor Dennis voice was supposed to
> represent Klatt's voice in later years, due to his illness breaking his
> voice, but then, it seems very strange that the later DECtalk versions would
> make the voice sound much better than the DECtalk 2.0 version, because I
> can't imagine his voice got that much better before his death in 1988.


I think trying to draw conclusions, retroactively, from a set of sound
samples is fraught with potential misunderstandings.

Keep in mind that, what *you* know as DECtalk, today, differs significantly
from Klattalk/MITalk in terms of implementation and performance.

Klatt was an academic.  AFAICT, he had never designed a product.
Klattalk/MITalk was an "intellectual exercise" but only existed
"in the lab".  DECtalk (the "hardware" synthesizer) was the first
attempt at reifying it into a commercially viable product.

It sold for several thousand dollars, at that time!  ($4K comes to mind).

But, when computers were still confined to universities and
businesses (recall, the IBM PC didn't come around until 1981),
a $4000 peripheral was par for the course -- a disk drive was
the size of a washing machine, a computer the size of a refrigerator,
etc.

The original MITalk ran on a PDP-11 minicomputer with special audio
output hardware.  Klatt didn't have to worry about accessing mass
storage -- the operating system took care of that.  He didn't have to
worry about "doing two things at once" -- the operating system took
care of that.  And, he didn't have to worry about CREATING an operating
system -- which would be required for a stand-alone device to
mimic his "minicomputer solution"!

The port to the 68000 microprocessor had to replicate his design
in a way that was feasible to implement on the (newly marketed)
microprocessor.  It had to include the necessary memory to store
his program, the parameters that drove it as well as the memory
for the input text and output waveforms.  AFAICT, there was no real
operating system in place; the program had free rein over the
hardware.

The original implementors obviously were more concerned about making
it speak than about making it do so inexpensively -- they added a
second processor (a Digital Signal Processor) to do the waveform
synthesis.  This added a lot of cost and complexity but must have
been the easiest way forward, for them, without reengineering
his solution.  (A 68000 has enough horsepower to do all of the
work without additional augmentation -- but, you have to
engineer the solution with that in mind!)

The DECtalk that you likely use (PC-based) returns the implementation
to the original environment -- on a "host computer" (instead of
on a dedicated computer).  Memory takes the form of files on a
disk, not individual "chips" (each with a co$t).

Each time the implementation is "touched", it likely incurs changes.
Some well-intended.  Some "necessary" for the "port".  Some just
hopeful improvements.

For example, MITalk was designed NOT to rely on a "pronouncing dictionary"
for anything but "a few" exceptional words.  Yet, the PC-based DECtalk
includes such a dictionary.  How often do you think "aardvark" comes up
in text?  How crucial is it for it to hold a spot in that dictionary??
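
The "rules first, dictionary only for exceptions" idea can be sketched
as below.  This is a toy under stated assumptions: the EXCEPTIONS table,
the RULES list, the phoneme labels and the function names are all
hypothetical, and the rules are nowhere near the real MITalk
letter-to-sound rules.

```python
# Hypothetical exception list: only words the rules would get wrong.
EXCEPTIONS = {
    "colonel": "K ER N AH L",
    "one": "W AH N",
}

# Toy letter-to-sound rules (assumed, not MITalk's); digraphs come first
# so that the longest pattern wins.
RULES = [
    ("sh", "SH"), ("ch", "CH"), ("th", "TH"),
    ("a", "AE"), ("b", "B"), ("c", "K"), ("d", "D"), ("e", "EH"),
    ("f", "F"), ("g", "G"), ("h", "HH"), ("i", "IH"), ("k", "K"),
    ("l", "L"), ("m", "M"), ("n", "N"), ("o", "AA"), ("p", "P"),
    ("r", "R"), ("s", "S"), ("t", "T"), ("u", "AH"), ("v", "V"),
    ("w", "W"), ("z", "Z"),
]

def letter_to_sound(word):
    """Scan left to right, emitting a phoneme for the first matching rule."""
    phones = []
    i = 0
    while i < len(word):
        for pattern, phone in RULES:
            if word.startswith(pattern, i):
                phones.append(phone)
                i += len(pattern)
                break
        else:
            i += 1  # no rule for this character; skip it
    return " ".join(phones)

def pronounce(word):
    """Consult the small exception list first, else fall back to the rules."""
    word = word.lower()
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]
    return letter_to_sound(word)
```

With this split, a word like "ship" never needs a dictionary entry; only
oddballs like "colonel" do.  Every entry you add costs memory, which is
why the size of that list is a design choice and not a given.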

There are lots of "degrees of freedom" in the implementation.
Lots of places where a developer has a choice as to how to do something.
Each choice has ramifications.  You might not be able to foresee
the consequences of these in order to avoid "unfortunate ones".

Klatt was obviously well versed in the theory of what he was trying
to do, even if his implementation skills were dubious.  The DEC folks
were likely more skilled at the (hardware) implementation and lacking
in all of the theory.

[Speech was a hot topic in the 70's.  There was a LOT of research on which
Klatt drew.  His thesis adviser -- Allen -- had tried to tackle the same
problem when *he* prepared his own thesis, a decade earlier!]

But, you can always move backwards and revisit something that *did*
work (or, worked "well enough").  Or, you can make a change that *subtly*
impacts the character of the output (would you be able to hear if
a particular phoneme's duration was extended by 5%?  Or, if the
pitch contour changed subtly?  Yet, you'd "sense" a difference...)


