[DECtalk] More Gnuspeech demos added
Tony Baechler
tony at baechler.net
Sat Oct 24 05:56:50 EDT 2015
Hi,
OK, well, I have to agree with you. I apologize for the technical piece of
sample text. The new batch of demos avoids this. After reading your
comments and listening closely, I see what you mean. It's pretty bad. What
really annoys me is that some of the voices apparently don't work.
Specifically, male, baby and small child sound the same, and I couldn't
understand large child. Female speaks faster than male, but that wasn't my
doing: I did not modify the voice files. I hesitate to try it without a
dictionary. Maybe with lots of work it could turn into something, but not
for now. The only correction I would make to your comments is that the OS X
version is under active development and the standalone version for Linux is
based on the OS X code. I have looked at the git logs and it looks like a
lot of development was done in the summer. The lisp can be fixed, but I
haven't figured out which file to modify. I do find the lisp very annoying.
It's a mystery to me why the female speaks faster than the male. The new
demos are encoded at 96 kbps just in case. The standalone version isn't hard
to compile. I would encourage you to play with the configuration files and
see if you get better results.
On 10/23/2015 9:06 AM, Carlos Fernandez via Dectalk wrote:
> I believe the problems with understanding the speech are somewhat related to
> the technical input, but not entirely. I'm sorry, but I don't agree about
> the flawless pronunciation. In terms of badly mangling the order of
> phonemes, you're right, the system manages to avoid that very well. However,
> the system's intelligibility is harmed by several problems in its processing:
> 1. The accent is all wrong. Words are often badly accented and thus sound
> very unnatural, even when the pronunciation is correct.
> 2. Bad phonemes for various letters make it very strange. Some are OK, but
> several more are simply cringeworthy.
> We will start with L and R, which unnecessarily add prior vowels. For
> example, take the word "work". In English, it is of course pronounced almost
> entirely without vowel, as W, R, K, with the R elongated as is typical with
> -er and -or, -ir, and -ur in some cases. If the system were too literal, it
> might pronounce the O, making something like woark. However, GNUSpeech
> generates a sound like wairk, which I'm assuming is a representation of an
> -er phoneme that is being improperly rendered as first -e and afterward -r.
> After this, we have a problem that arises when a C is used with its
> softer (S-like) sound and, to a lesser extent, with the actual S itself.
> The result never sounds quite like a standard S and lacks high-frequency
> noise, so it falls somewhere between S and the English TH. It is much
> closer to the correct sound, and maybe it has something to do with the
> frequency of audio being used, but when I am not focusing on the
> individual sounds, the voice seems to have a slight lisp, which does not
> make it easier to understand.
> 3. The voice seems hesitant on beginning to speak another word, but quickly
> builds up steam while crossing the word, such that, to me, the word is begun
> slowly but babbled out quickly. This creates a jerky aspect that is a bit
> difficult to handle. I am very used to using high-speed synthesizers, but
> they at least stay at one speed. Sometimes the voice will continue at its
> previous speed if the words are in the same sentence, but sometimes not.
>
> You mentioned DecTalk, Eloquence, and eSpeak in the failure to pronounce
> section, so I decided to try these on the same passage (I didn't make it all
> the way through, of course, but quite a ways in).
> Eloquence mispronounced one word, and it was copyleft. As this is more of a
> play on words than an actual dictionary term, I understand and accept this
> as a less-seen word, especially in the 1990s.
> eSpeak pronounced everything impeccably. I could not find a single error. It
> even pronounced GNU the way I do, with the G enunciated. I did not regard
> the silencing of the G for other synthesizers as an error, as a word gnu
> exists with this silent letter. The sound quality may not be everyone's cup
> of tea, but the pronunciation is clearly not lacking.
> Dectalk had a few words that were not so much mispronounced as misaccented.
> It was completely understandable throughout my section despite minor
> glitches that might make it slightly less desirable.
> All three, in other words, could be listened to naturally and understood
> completely, which I do not find true of GNUSpeech at this time.
>
>
> On NeXT, it is true that OS X was mostly based on the NeXTSTEP operating
> system, but it has been independent of that original codebase and updated by
> Apple for sixteen years. Programs that functioned on NeXTSTEP do not
> compile and work on OS X; the operating systems are related but very
> different. Therefore, when the page says that the NeXT version is complete
> but the Linux one is not and gives information about obtaining a computer on
> which to run the original 1990s versions of NeXTSTEP, I do worry slightly
> about the logic behind this. This also leads me to wonder where the 1990s
> code was received, as it doesn't purport to be from NeXT but some other
> company, and how (and indeed whether) the project got the rights to use it.
> As OS X and Linux are my most frequently used operating systems, I have
> downloaded the code and will further investigate.
> Here are some quotes about NeXTSTEP that prompted my questions. I have
> bracketed some notes inside these as well:
> "gnuspeech is currently fully available as a NextSTEP 3.x version in the SVN
> repository along with the Gnu/Linux/GNUStep version, which is incomplete
> though functional."
> "The original NeXT User and Developer Kits are complete, but do not run
> under OS X or under GNUStep on GNU/Linux. They also suffer from the
> limitations of a slow machine, so that shorter TRM lengths (< ~15 cm) cannot
> be used in real time, though the software synthesis option allows this
> restriction to be avoided."
> "In fact, you can use these passwords [why are there passwords at all? Maybe
> this is a NeXT thing?]. But you need a NeXT computer, of course—try [a
> commercial company, linked here, that sells vintage NeXT computers and
> copies of the software. They recommend the latest version, 3.3, in order to
> avoid Y2K bugs.] if you'd like one."
>
> Carlos
> On 10/23/2015 09:42, Tony Baechler via Dectalk wrote:
>> For your amusement and interest, I've added two more mp3 Gnuspeech demos,
>> including one of the female voice. As always, comments appreciated.
>>
>> http://classicradio.us/iso/
>>
>
> _______________________________________________
> Dectalk mailing list
> Dectalk at bluegrasspals.com
> http://bluegrasspals.com/mailman/listinfo/dectalk
--
--------------------
Tony Baechler, founder, Baechler Access Technology Services
Putting accessibility at the forefront of technology
mailto:bats at batsupport.com
Phone: 1-619-746-8310 Fax: 1-619-449-9898