[DECtalk] Intelligibility/Listenability criteria

Don Text_to_Speech at GMX.com
Sun Jul 21 00:53:27 EDT 2019


Hi,

Perhaps a bit off-topic for this list... if so, my apologies.

I'm looking for opinions as to how one evaluates the "effectiveness"
of a particular synthesizer.  Said another way, how one decides that
synthesizer A is "better" than synthesizer B.  Ideally, criteria that
would allow you to rank a set of them!

I've been auditioning various synthesis devices and techniques
to try to come to my own conclusions on this.  Then, hopefully,
work backwards to come up with some objective criteria by which
they could each be "scored" (even if that was done using bogus
rating units).

"Intelligibility" is, of course, the prime issue.  "Listenability"
comes into play with any prolonged use.  Finally, "naturalness"
matters when it comes to extended use.

For example, the old Votrax units were intelligible -- once you
learned their "accent".  But, listenability was rather poor... you
quickly developed ear fatigue.  And, the idea of naturalness was
never even considered!

With gobs of resources (hardware, software, processing power), you
can achieve quite acceptable results.  This seems to be the approach
most "modern" synthesizers -- and techniques -- adopt.  The real problem
lies with limited resources attempting to handle unconstrained input.
(If you know what you're going to be asked to speak, it's really easy to
come up with a good presentation!)

Limiting the user's exposure to the synthetic voice can reduce ear fatigue.
So, dealing with it for 10 minutes might be tolerable while 2 hours
would be torture.

But, having to face the prospect of completely unconstrained input can
tax even that brief usage.  "Dr. Jones' car -- bearing the license plate
FTDKTR -- has been parked in front of his house on Jones Dr. since 12:34A
this morning when his Polish butler finished polishing it."  Imagine you
have no other way of inspecting the input text...
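
To make the ambiguity in that sentence concrete, here is a minimal
Python sketch of a naive rule for the two readings of "Dr.".  It is
not taken from DECtalk or any other real synthesizer; the function
name expand_dr and the capitalization heuristic are assumptions made
purely for illustration.  The rule reads "Dr." as "Doctor" when the
next word is capitalized and as "Drive" otherwise.

```python
import re


def expand_dr(text):
    """Expand 'Dr.' with a naive, hypothetical heuristic.

    'Dr.' followed by a capitalized word is read as 'Doctor';
    anything else is read as 'Drive'.
    """
    def choose(match):
        # Look at the first non-space character after this 'Dr.'
        following = text[match.end():].lstrip()
        return "Doctor" if following[:1].isupper() else "Drive"

    return re.sub(r"\bDr\.", choose, text)


if __name__ == "__main__":
    print(expand_dr("Dr. Jones' car is parked on Jones Dr. since noon"))
```

Even on this one abbreviation the rule is fragile: a street name that
ends a sentence ("...on Jones Dr. He left...") is followed by a
capitalized word and would be misread as "Doctor".  The license
plate, the time, and "Polish" versus "polishing" each need separate
handling of their own.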

So, what makes a synthesizer "tolerable" or "intolerable"?  What is the
"threshold of pain" when it comes to tolerating an underperforming
synthesizer?

