[DECtalk] More Gnuspeech demos added

Tony Baechler tony at baechler.net
Sat Oct 24 05:56:50 EDT 2015


Hi,

OK, well, I have to agree with you.  I apologize for the technical piece of
sample text.  The new batch of demos avoids this.  After reading your
comments and listening closely, I see what you mean.  It's pretty bad.  What
really annoys me is that some of the voices apparently don't work at all.
Specifically, male, baby and small child sound the same, and I couldn't
understand large child.  Female speaks faster than male, and I don't know
why; it wasn't my doing, as I did not modify the voice files.  I hesitate to
try it without a dictionary.  Maybe with lots of work it could turn into
something, but not for now.

The only correction I would make to your comments is that the OS X version
is under active development and the standalone version for Linux is based on
the OS X code.  I have looked at the git logs and it looks like a lot of
development was done over the summer.  The lisp can be fixed, but I haven't
figured out which file to modify.  I do find the lisp very annoying.

The new demos are encoded at 96 kbps just in case.  The standalone version
isn't hard to compile.  I would encourage you to play with the configuration
files and see if you get better results.

On 10/23/2015 9:06 AM, Carlos Fernandez via Dectalk wrote:
> I believe the problems with understanding the speech are somewhat related to
> the technical input, but not entirely. I'm sorry, but I don't agree about
> the flawless pronunciation. In terms of badly mangling the order of
> phonemes, you're right, the system manages to avoid that very well. However,
> the system's intelligibility is harmed by several problems in its processing:
> 1. The accent is all wrong. Words are often stressed badly, which sounds very
> unnatural even when the pronunciation is correct.
> 2. Bad phonemes for various letters make it very strange. Some are OK, but
> several more are simply cringeworthy.
> We will start with L and R, which unnecessarily add prior vowels. For
> example, take the word "work". In English, it is of course pronounced almost
> entirely without vowel, as W, R, K, with the R elongated as is typical with
> -er and -or, -ir, and -ur in some cases. If the system was too literal, it
> might pronounce the O, making something like woark. However, GNUSpeech
> generates a sound like wairk, which I'm assuming is a representation of an
> -er phoneme that is being improperly rendered as first -e and afterward -r.
>      After this, we have a problem that arises when a C is used with its
> softer (similar to S) sound and, to a lesser extent, the actual S itself.
> This always sounds less like a standard S and lacking in high-frequency
> noise. It is thus a transitional case between the S and TH (in English). It
> is much closer to the correct version, and maybe it has something to do with
> the frequency of audio being used, but it sounds to me when I am not
> focusing on the sounds individually like the voice has a slight lisp, which
> does not make it easier to understand.
> 3. The voice seems hesitant on beginning to speak another word, but quickly
> builds up steam while crossing the word, such that, to me, the word is begun
> slowly but babbled out quickly. This creates a jerky aspect that is a bit
> difficult to handle. I am very used to using high-speed synthesizers, but
> they at least stay at one speed. Sometimes the voice will continue at its
> previous speed if the words are in the same sentence, but sometimes not.
>
> You mentioned DECtalk, Eloquence, and eSpeak in the failure to pronounce
> section, so I decided to try these on the same passage (I didn't make it all
> the way through, of course, but quite a ways in).
> Eloquence mispronounced one word, and it was copyleft. As this is more of a
> play on words than an actual dictionary term, I understand and accept this
> as a less-seen word, especially in the 1990s.
> eSpeak pronounced everything impeccably. I could not find a single error. It
> even pronounced GNU the way I do, with the G enunciated. I did not regard
> the silencing of the G for other synthesizers as an error, as a word gnu
> exists with this silent letter. The sound quality may not be everyone's cup
> of tea, but the pronunciation is clearly not lacking.
> DECtalk had a few words that were not so much mispronounced as misaccented.
> It was completely understandable throughout my section, despite minor
> glitches that might make it slightly less desirable.
> All three, in other words, could be listened to naturally and understood
> completely, which I do not find true of GNUSpeech at this time.
>
>
> On NeXT, it is true that OS X was mostly based on the NextSTEP Operating
> System, but it has been independent of that original codebase and updated by
> Apple for sixteen years. Programs that functioned for NextSTEP do not
> compile and work on OS X; the operating systems are similar but very
> different. Therefore, when the page says that the NeXT version is complete
> but the Linux one is not and gives information about obtaining a computer on
> which to run the original 1990s versions of NextSTEP, I do worry slightly about
> the logic behind this. This also leads me to wonder from where the 1990s
> code was received, as it doesn't purport to be from NeXT but some other
> company, and how (and indeed whether) the project got the rights to use it.
> As OS X and Linux are my most frequently-used operating systems, I have
> downloaded the code and will further investigate.
> Here are some quotes about NextSTEP that induced my questions. I have
> bracketed some notes inside these as well:
> "gnuspeech is currently fully available as a NextSTEP 3.x version in the SVN
> repository along with the Gnu/Linux/GNUStep version, which is incomplete
> though functional."
> "The original NeXT User and Developer Kits are complete, but do not run
> under OS X or under GNUStep on GNU/Linux. They also suffer from the
> limitations of a slow machine, so that shorter TRM lengths (< ~15 cm) cannot
> be used in real time, though the software synthesis option allows this
> restriction to be avoided."
> "In fact, you can use these passwords [why are there passwords at all? Maybe
> this is a NeXT thing?]. But you need a NeXT computer, of course—try [a
> commercial company, linked here, that sells vintage NeXT computers and
> copies of the software. They recommend the latest version, 3.3, in order to
> avoid Y2K bugs.] if you'd like one."
>
> Carlos
> On 10/23/2015 09:42, Tony Baechler via Dectalk wrote:
>> For your amusement and interest, I've added two more mp3 Gnuspeech demos,
>> including one of the female voice.  As always, comments appreciated.
>>
>> http://classicradio.us/iso/
>>
>
> _______________________________________________
> Dectalk mailing list
> Dectalk at bluegrasspals.com
> http://bluegrasspals.com/mailman/listinfo/dectalk

-- 
--------------------
Tony Baechler, founder, Baechler Access Technology Services
Putting accessibility at the forefront of technology
mailto:bats at batsupport.com
Phone: 1-619-746-8310   Fax: 1-619-449-9898
