[DECtalk] Intelligibility/Listenability criteria

Aksel Leo Christoffersen aksel at blindsigtmail.dk
Sun Jul 21 06:46:03 EDT 2019


Hi Don,

I use Danish voices most of the time.

For Danish, I like Mette from Acapela Infovox, or Ida from the old Nuance
Vocalizer, because they're clear, high quality, and easy to understand.
Unfortunately, there aren't many male voices in Danish. There are some, but
they're not very good, in my opinion.

For English, I like DECtalk, for most of the same reasons, except that it
only has a sample rate of 11025 Hz.
I don't like Eloquence very much. For funny things, I like TruVoice and the
old Microsoft voices.

Best regards,
Aksel Christoffersen

-----Original Message-----
From: Dectalk [mailto:dectalk-bounces at bluegrasspals.com] On Behalf Of Don
Sent: 21 July 2019 06:53
To: dectalk at bluegrasspals.com
Subject: [DECtalk] Intelligibility/Listenability criteria

Hi,

Perhaps a bit off-topic for this list... if so, my apologies.

I'm looking for opinions as to how one evaluates the "effectiveness"
of a particular synthesizer.  Said another way, how one decides that
synthesizer A is "better" than synthesizer B.  Ideally, criteria that
would allow you to rank a set of them!

I've been auditioning various synthesis devices and techniques
to try to come to my own conclusions on this.  Then, hopefully,
work backwards to come up with some objective criteria by which
they could each be "scored" (even if that was done using bogus
rating units).

"Intelligibility" is, of course, the prime issue, with "listenability"
coming into play for any prolonged use and, finally, "naturalness"
mattering most for extended use.

For example, the old Votrax units were intelligible -- once you
learned their "accent".  But, listenability was rather poor... you
quickly developed ear fatigue.  And, the idea of naturalness was
never even considered!

With gobs of resources (hardware, software, processing power), you
can achieve quite acceptable results.  This seems to be the approach
most "modern" synthesizers -- and techniques -- adopt.  The real problem
lies with limited resources attempting to handle unconstrained input.
(If you know what you're going to be asked to speak, it's really easy to
come up with a good presentation!)

Limiting the user's exposure to the synthetic voice can reduce ear fatigue.
So, dealing with it for 10 minutes might be tolerable while 2 hours
would be torture.

But, having to face the prospect of completely unconstrained input can
tax even that brief usage.  "Dr. Jones' car -- bearing the license plate
FTDKTR -- has been parked in front of his house on Jones Dr. since 12:34A
this morning when his Polish butler finished polishing it."  Imagine you
have no other way of inspecting the input text...

So, what makes a synthesizer "tolerable" or "intolerable"?  What is the
"threshold of pain" when it comes to tolerating an underperforming
synthesizer?



