[DECtalk] More Gnuspeech demos added

Carlos Fernandez cf530a at gmail.com
Fri Oct 23 12:06:33 EDT 2015


I believe the problems with understanding the speech are somewhat 
related to the technical input, but not entirely. I'm sorry, but I don't 
agree about the flawless pronunciation. In terms of badly mangling the 
order of phonemes, you're right, the system manages to avoid that very 
well. However, the system's intelligibility is harmed by several 
problems in its processing:
1. The accent is all wrong. Words are often accented badly, which sounds 
very unnatural even when the pronunciation is correct.
2. Bad phonemes for various letters make it very strange. Some are OK, 
but several more are simply cringeworthy.
We will start with L and R, which unnecessarily add a preceding vowel. 
For example, take the word "work". In English, it is of course 
pronounced almost entirely without a vowel, as W, R, K, with the R 
elongated as is typical with -er and, in some cases, -or, -ir, and -ur. 
If the system were too literal, it might pronounce the O, making 
something like woark. However, Gnuspeech generates a sound like wairk, 
which I'm assuming is a representation of an -er phoneme that is being 
improperly rendered as first -e and afterward -r.
     After this, we have a problem that arises when a C is used with 
its softer (S-like) sound and, to a lesser extent, with the actual S 
itself. It always sounds unlike a standard S, lacking in high-frequency 
noise, so it falls somewhere between S and TH (in English). It is much 
closer to the correct version, and maybe it has something to do with 
the frequency of the audio being used, but when I am not focusing on 
the individual sounds, it sounds to me as if the voice has a slight 
lisp, which does not make it easier to understand.
3. The voice seems hesitant when beginning a new word, but quickly 
builds up steam as it moves through the word, such that, to me, each 
word is begun slowly but babbled out quickly. This creates a jerky 
quality that is a bit difficult to handle. I am very used to high-speed 
synthesizers, but they at least stay at one speed. Sometimes the voice 
will continue at its previous speed if the words are in the same 
sentence, but sometimes not.

You mentioned DECtalk, Eloquence, and eSpeak in the "failure to 
pronounce" section, so I decided to try these on the same passage (I 
didn't make it all the way through, of course, but quite a ways in).
Eloquence mispronounced only one word, "copyleft". As this is more of a 
play on words than an actual dictionary term, I understand and accept 
this as a rarely seen word, especially in the 1990s.
eSpeak pronounced everything impeccably. I could not find a single 
error. It even pronounced GNU the way I do, with the G enunciated. I did 
not regard the silencing of the G for other synthesizers as an error, as 
a word gnu exists with this silent letter. The sound quality may not be 
everyone's cup of tea, but the pronunciation is clearly not lacking.
DECtalk had a few words that were not so much mispronounced as 
misaccented. It was completely understandable throughout my section, 
despite minor glitches that might make it slightly less desirable.
All three, in other words, could be listened to naturally and understood 
completely, which I do not find true of Gnuspeech at this time.


On NeXT, it is true that OS X was mostly based on the NeXTSTEP operating 
system, but it has been independent of that original codebase and 
updated by Apple for sixteen years. Programs that worked on NeXTSTEP do 
not compile and run on OS X; the operating systems are related but very 
different. Therefore, when the page says that the NeXT version is 
complete but the Linux one is not, and gives information about obtaining 
a computer on which to run the original 1990s versions of NeXTSTEP, I do 
worry slightly about the logic behind this. This also leads me to wonder 
where the 1990s code came from, as it doesn't purport to be from NeXT 
but from some other company, and how (and indeed whether) the project 
got the rights to use it. As OS X and Linux are my most frequently used 
operating systems, I have downloaded the code and will investigate 
further.
Here are some quotes about NeXTSTEP that prompted my questions. I have 
added some bracketed notes inside these as well:
"gnuspeech is currently fully available as a NextSTEP 3.x version in the 
SVN repository along with the Gnu/Linux/GNUStep version, which is 
incomplete though functional."
"The original NeXT User and Developer Kits are complete, but do not run 
under OS X or under GNUStep on GNU/Linux. They also suffer from the 
limitations of a slow machine, so that shorter TRM lengths (< ~15 cm) 
cannot be used in real time, though the software synthesis option allows 
this restriction to be avoided."
"In fact, you can use these passwords [why are there passwords at all? 
Maybe this is a NeXT thing?]. But you need a NeXT computer, of 
course—try [a commercial company, linked here, that sells vintage NeXT 
computers and copies of the software. They recommend the latest version, 
3.3, in order to avoid Y2K bugs.] if you'd like one."

Carlos

On 10/23/2015 09:42, Tony Baechler via Dectalk wrote:
> For your amusement and interest, I've added two more mp3 Gnuspeech 
> demos, including one of the female voice.  As always, comments 
> appreciated.
>
> http://classicradio.us/iso/
>


