[DECtalk] Intelligibility/Listenability criteria
Jayson Smith
jaybird at bluegrasspals.com
Tue Jul 23 13:47:08 EDT 2019
Hi,
You bring up many interesting points and problems.
One thought that just came to my mind is Stephen Hawking. For many years
he used a speech synthesizer called the CallText 5010 by Speech Plus.
But unlike most members of this list, he relied on this device as his
voice, period.
This obviously means people who were not accustomed to speech synthesis
were exposed to his synthetic voice. This was especially true in the '80s
and '90s, when synthetic speech wasn't nearly as mainstream as it is now
(e.g. no Alexa/Google/Siri, and few if any synthetic-voiced robocalls
reminding you of doctor appointments). His voice sounded similar to
DECtalk, so the two were often confused, to the point that the DECtalk
article on Wikipedia claims that Hawking used a form of DECtalk. The
interesting thing about this particular synthesizer is that an emulation
of it was completed, to be used by Hawking, a few months before his
death. This brings this classic voice into the modern world, but
unfortunately it's likely that none of us will ever get a
chance to play with it. According to an article, the source code was
turned over to the Hawking estate, which certainly has reason to protect
that particular voice. I know one friend who's told me that if this
voice were made available, he'd use it as his default screen reader voice.
As I see it, the problem with your scenario of the crippled voice in the
earpiece is that you can't please everyone no matter how hard you try.
Ideally, if the big box in the closet can send synthetic speech to the
earpiece once the user is properly authenticated, that's an audio
stream, so it should be able to send a pre-recorded file to
unauthenticated/unregistered users. This pre-recorded file could be
spoken by a more capable synthesizer, or even by an actual human, who
can clearly state the message. If spoken by a human, in theory that
human would know, or could find out, how to properly pronounce unusual
names, places, etc.
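
To make that fallback order concrete, here's a rough sketch in Python. Everything in it is my own invention for illustration (the function name, the source labels, and the idea of keying recordings by a message ID); I don't know how your big box actually represents any of this.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class AudioReply:
    source: str   # "box_tts", "recording", or "earpiece_tts" (hypothetical labels)
    payload: str  # text to synthesize, or the name of a recorded file


def choose_reply(box_reachable: bool, authenticated: bool,
                 message_id: str, text: str,
                 recordings: Dict[str, str]) -> AudioReply:
    """Pick the best available way to speak a message to the user."""
    # Authenticated users get the capable synthesizer on the big box.
    if box_reachable and authenticated:
        return AudioReply("box_tts", text)
    # Unauthenticated users can still be sent a canned recording,
    # if one exists for this particular message.
    if box_reachable and message_id in recordings:
        return AudioReply("recording", recordings[message_id])
    # No box, or no recording for this situation: the earpiece's own
    # limited synthesizer has to speak the raw text.
    return AudioReply("earpiece_tts", text)
```

For example, choose_reply(True, False, "unregistered", "...", {"unregistered": "unregistered.wav"}) picks the recording, while the same call with an unknown message ID falls through to the earpiece.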
This leaves our poor crippled synthesizer in the earpiece to deal with
those situations where no big box can be reached at all, or a situation
has arisen for which no canned recording is available. Once again, you
can't please everybody. If you're designing the big boxes and writing
their documentation, you can include best practices to ensure that when
administrators write error messages, they make them as friendly to the
particular synthesizer you've chosen as possible. But that doesn't mean
the admins have to follow your recommendations. And then maybe there's
the newbie admin who is rushing through installation and configuration
and just wants to get this up and running as quickly as possible, skims
through your docs briefly, and some poor user who needs to register gets
the following mess, spoken by their earpiece's crippled synthesizer:
Sorry we dont recognise your device ID. Please clal dr Johnson at
8178446611 Tahnk you
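
One thing the big box could do is run a quick sanity check on messages as the admin enters them. Below is a minimal sketch in Python; the function name, the three checks, and the idea of passing in a word list are all assumptions of mine, not anything from your design, and a real check would need to know the quirks of whichever synthesizer you've chosen.

```python
import re


def lint_message(text, known_words=None):
    """Return a list of warnings about text a weak synthesizer may mangle."""
    warnings = []
    # Long unbroken digit runs tend to be read as one enormous number.
    for run in re.findall(r"\d{5,}", text):
        warnings.append(f"long digit run '{run}': consider grouping the digits")
    # Without final punctuation the synthesizer has no cue to end the phrase.
    if not text.rstrip().endswith((".", "!", "?")):
        warnings.append("message does not end with sentence punctuation")
    # Optional spelling check against a caller-supplied set of lowercase words.
    if known_words is not None:
        for word in re.findall(r"[A-Za-z']+", text):
            if word.lower() not in known_words:
                warnings.append(f"unknown word '{word}'")
    return warnings
```

Run against the message above with a small word list, this flags the phone number, the missing final punctuation, and "dont", "clal" and "Tahnk". It can only warn, of course; the admin is still free to ignore it.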
Now for those situations where no big box can be found at all, assuming
you're in control of the firmware for the earpieces, you know exactly
how your crippled synthesizer works, and can work around any quirks it
has in order to provide the most understandable messages possible.
I hope this helps,
Jayson
On 7/23/2019 1:10 PM, Don wrote:
> On 7/23/2019 6:38 AM, Jayson Smith wrote:
>> Hi,
>>
>> A few points here.
>> First, I think it's a little of both pronunciation and the voice itself
>> that gets on my nerves with ESpeak.
>>
>> Second, I'd argue that Alex and Alexa do have to contend with unrestrained
>> input. If I go into my Alexa web portal and put
>> "Sfjsaofhdsahbfiuewfbhifgbfvbiuewqfbirewqfbiwfbiubifdsava" on my shopping
>> list, then ask her to read my shopping list, she's going to have to deal
>> with that horrible mess of text.
>
> Yes, but what does it end up saying? And, I suspect it says whatever
> in a very specific context: "I'm sorry, I don't find anyone who is
> selling 'Sfjsaofhdsahbfiuewfbhifgbfvbiuewqfbirewqfbiwfbiubifdsava'".
>
> It doesn't have to contend with trying to apply prosody to
> incomplete sentences or a meaningless series of words:
> "Bob went snigglepuss" "Here is teh mising peas of the puzl"
>
> I can ensure that all of the messages that I generate for the
> synthesizer (either synthesizer!) are grammatically correct. I
> can craft them in such a way as to avoid difficult pronunciations.
> Or, to exploit known text normalization patterns (e.g., presenting
> "2,019" to the synthesizer when I want it to say "two thousand and
> nineteen" instead of "twenty nineteen".)
>
> But, I can't guarantee that folks who extend my design will be
> as disciplined. Yet, the user will have to contend with the
> "input text" chosen by those folks for their extensions!
>
> I don't think it is acceptable (or ethical) to say "that's not
> MY problem!"