[DECtalk] Some DECtalk history and what I think we can and can't reasonably do

Thu Aug 4 18:28:16 EDT 2011

My idea is to just do a last bit of tweakage on the voices as far as 
formant parameters go, then do an initial release. It need not sound like 
a particular version, just loads better than that HLsyn stuff from 
before. The current version sounds like maybe a 4.5 or 4.6x version, and 
that is totally fine to me. It's DECTalk and it's sounding unique, crisp 
and clear. As Raymond pointed out, there's a bit of words running 
together, such as "test of", but other than that, things are looking up 
for DECTalk. Any thoughts?

Alex

On 8/4/2011 1:31 PM, ebruckert Bruckert wrote:
> Here is my plan: we now need to enter a release cycle where Corine and 
> I carefully tune the voices on the new synthesizer base and come out 
> with the first release. I'm unwilling to try to make an exact match 
> before we do a first release. There are many reasons for this, and the 
> real issue is that this is the way to really start. After the initial 
> release we can worry about other details where we have to look for 
> consensus on what people would like. Also, in many areas the rules are 
> highly interactive, so a change may fix the exact problem you're trying 
> to fix but have unintended side effects. There are also other 
> priorities, such as providing a way to better control the synthesizer 
> by getting around commands blocked by the screen reader application. I 
> will update the file system and get started with Corine, hopefully 
> tomorrow. Today I'm sick as a dog, so I don't want to do anything when 
> I can barely think. And I am willing to continue for free to try and 
> please the users as long as there is interest.
> For myself I can say I've listened to DECtalk so much, that I'm quite 
> happy with the version we have right now.
> As a point of interest, what I think I have learned so far is that the 
> single biggest issue was spectral tilt, when we incorporated the change 
> made by Dennis, which from a speech standpoint is more correct, meaning 
> more natural in its spectral range. But from the overwhelming reaction 
> we have anecdotal proof that this spectral shape is better for users. 
> This is actually not terribly surprising because, on the other side of 
> the coin, we lack the higher formants, because for compute and other 
> reasons it was impossible to add these to the synthesizer. At this 
> point we could theoretically add them in, but it's a fairly large 
> effort because we'd have to go from integer arithmetic to floating 
> point for the vocal tract, as we're presently at the limit of what we 
> can do with 16-bit integers.
>
> On Thu, Aug 4, 2011 at 7:46 AM, FRIDO ORDEMANN 
> <enablerehab at verizon.net <mailto:enablerehab at verizon.net>> wrote:
>
>     i can't tell the difference when listening as Ed suggests - excellent!
>     thanks, Ed
>
>     *From:* Alex H. <linuxx64.bashsh at gmail.com
>     <mailto:linuxx64.bashsh at gmail.com>>
>     *To:* dectalk at bluegrasspals.com <mailto:dectalk at bluegrasspals.com>
>     *Sent:* Wed, August 3, 2011 4:34:48 PM
>
>     *Subject:* Re: [DECtalk] Some DECtalk history and what I think we
>     can and can't reasonably do
>
>     Agreed. This new sample rules. It's pretty darn close to the
>     original and has its own coolness..
>
>     alex
>
>     On 8/3/2011 4:09 PM, jake mcmahan wrote:
>>     On 8/3/2011 3:42 PM, ebruckert Bruckert wrote:
>>>     Okay, as an update: listen to the two wave files separately,
>>>     not back-and-forth. Listen to one, wait a few minutes, then
>>>     listen to the other. See if you agree we're getting closer;
>>>     one, of course, is what you sent me.
>>>
>>>     On Wed, Aug 3, 2011 at 1:43 PM, ebruckert Bruckert
>>>     <edbruckert at gmail.com <mailto:edbruckert at gmail.com>> wrote:
>>>
>>>         agreed
>>>
>>>
>>>         On Wed, Aug 3, 2011 at 1:38 PM, Alex H.
>>>         <linuxx64.bashsh at gmail.com
>>>         <mailto:linuxx64.bashsh at gmail.com>> wrote:
>>>
>>>             I, too, hope that HLsyn eventually will be a viable
>>>             option and we could use the old method or HLsyn if we
>>>             wanted, maybe for reading long texts and so on. It's a
>>>             great idea and theory but just isn't mature enough at
>>>             this point.
>>>
>>>             Alex
>>>
>>>
>>>
>>>
>>>             On 8/3/2011 1:13 PM, ebruckert Bruckert wrote:
>>>>             There are always two sides to a coin: if DECtalk hadn't
>>>>             been purchased it would have died. And since there was
>>>>             no money from anyone to work on handicapped
>>>>             applications, we had to do what our customers wanted
>>>>             or go home. I recognize that the HLsyn work did not
>>>>             yield the hoped-for results, and perhaps someday it can
>>>>             with what we learned in our failures. But it was a
>>>>             decision based on the best knowledge we had at the time
>>>>             and in fact also on Dennis Klatt's work. The problems
>>>>             that occurred with the HLsyn version aren't of any
>>>>             interest to me because the version put out was an early
>>>>             one and it's not the right time to pursue trying to
>>>>             perfect HLsyn. So all I can do is my best.
>>>>                As to the person that mentioned the idea of putting
>>>>             meaning into the text: DECtalk actually has the ability
>>>>             to do some marking and adjustment to try and achieve
>>>>             that by hand. Automating the system to do that is a
>>>>             great deal beyond our knowledge and capability.
>>>>             Understanding what is being conveyed is extremely
>>>>             difficult for a computer. A simple example: "You did
>>>>             that." Depending on which word you emphasize most,
>>>>             there are three different ways of saying this very
>>>>             simple sentence, with dramatically different meanings.
>>>>             On Wed, Aug 3, 2011 at 12:07 PM, Alex H.
>>>>             <linuxx64.bashsh at gmail.com
>>>>             <mailto:linuxx64.bashsh at gmail.com>> wrote:
>>>>
>>>>                 Well, to us, we never really heard later versions
>>>>                 of DT, only the classics from the 90's, so forgive
>>>>                 us if we compare the new attempts to prior versions
>>>>                 - it's not like we have a huge library of source
>>>>                 code to just browse at will and endless samples of
>>>>                 every version.... so... yeah.
>>>>
>>>>                 Wanna know what's been wrong with the samples and
>>>>                 attempts posted to this list a few months ago for
>>>>                 the sapi dectalk? I'll tell you.
>>>>
>>>>                 The voices were clipping and squawking, and all the
>>>>                 voices sounded like they had a speech problem.
>>>>                 Perfect Paul wasn't perfect as most of us have
>>>>                 heard before. The voices themselves sound not like
>>>>                 DECTalk at all, they also drop out in volume, just
>>>>                 like a human cuz it's using HLsyn to make it sound
>>>>                 more natural.
>>>>                 I've heard DT 4.2cd, 4.3, 4.4, 4.61, 4.62 and 4.64.
>>>>                 But since you've pointed out before that version
>>>>                 numbers don't matter, so to speak, is this even
>>>>                 important anyway or are we just listening to the
>>>>                 same code with minor tweaks to get the various
>>>>                 versions we know?
>>>>
>>>>                 Disable HLsyn in the new product, and it'll suck
>>>>                 less. I like formant based synths, not ones that
>>>>                 try and sound human, because I and others are used
>>>>                 to classic formant non-HLsyn versions of DECTalk.
>>>>                 True that HLsyn is still formant but it's trying to
>>>>                 sound real and have human articulation, and knowing
>>>>                 that I can understand why this version sounds
>>>>                 different. It's just not what we're used to, that's
>>>>                 all. Some Joe Blow off the street who has never
>>>>                 heard synthesized speech can't tell Eloquence
>>>>                 from DECTalk from eSpeak anyway, so this point of
>>>>                 understanding speech is a moot one.  They'd be
>>>>                 better off using Cepstral or some human-sampled
>>>>                 synths and wasting their hard drive space. This is
>>>>                 being targeted at a relatively small group of
>>>>                 people who have used DECTalk before and like it, so
>>>>                 I think we're safe there. I'd consider giving HLsyn
>>>>                 another shot if it was completed. But as always,
>>>>                 corporate America screws everyone over in the end,
>>>>                 and that was the case with DECTalk. So much so,
>>>>                 that Fonix wanted to make FonixTalk and specifically
>>>>                 try and make it sound human. The result sucks.
>>>>
>>>>
>>>>                 Alex
>>>>                 On 8/3/2011 11:17 AM, ebruckert Bruckert wrote:
>>>>>                    First of all, let me make you aware that I use
>>>>>                 DragonDictate, as I can't see very well and
>>>>>                 proofreading is quite painful, so you'll have to
>>>>>                 forgive and interpret around the mistakes that
>>>>>                 DragonDictate may make.
>>>>>                    I was taught about formant speech synthesis by
>>>>>                 Dennis Klatt and by reading, but before my
>>>>>                 involvement with him I knew next to nothing. One
>>>>>                 of the questions in the early days was: could you
>>>>>                 achieve higher intelligibility by super-articulation
>>>>>                 and do better than natural speech?
>>>>>                 What testing revealed was really two things. At
>>>>>                 normal speaking rates the answer always seemed to be
>>>>>                 that the closer you matched real speech, the
>>>>>                 better the intelligibility. At higher speaking
>>>>>                 rates, above those which humans could normally
>>>>>                 achieve, things were a little different, and I'm not
>>>>>                 going to go into the specifics of what we did to
>>>>>                 make things better at high speed other than to say
>>>>>                 they were based on knowledge of speech perception.
>>>>>                      The second thing we learned is that listening
>>>>>                 to a synthesizer has a very fast but steep
>>>>>                 learning curve. Somewhat analogous to learning to
>>>>>                 understand a person with a strong dialect or
>>>>>                 speech impediment. One of the problems we
>>>>>                 encountered is that people often preferred the
>>>>>                 version they were used to over any succeeding
>>>>>                 version. But actual tests did not support the
>>>>>                 preference.
>>>>>                      One example is the way tilt was done inside
>>>>>                 DECtalk. The original mechanism was a crude
>>>>>                 approximation of spectral tilt. Dennis, before he
>>>>>                 died, developed a much more accurate (meaning
>>>>>                 matching human production) tilt filter that was
>>>>>                 not able to be incorporated until a later date. As a
>>>>>                 point of interest, Dennis was so dedicated that he
>>>>>                 last modified the DECtalk code 3 days before he
>>>>>                 passed away. So the spectral tilt was changed, and
>>>>>                 this changed what you might consider the tone
>>>>>                 control on an old radio or record player. That is
>>>>>                 just one of many reasons why DECtalk changed
>>>>>                 slightly over the years.
>>>>>                       The 5.0 DECtalk incorporated the work of
>>>>>                 Prof. Ken Stevens, who was Dennis's boss at MIT
>>>>>                 and close friend. The 5.0 code unfortunately did
>>>>>                 not yield the expected results, but we did learn a
>>>>>                 lot from the attempt.
>>>>>                        There are even some changes to DECtalk that
>>>>>                 would change the way it sounds from any particular
>>>>>                 version, such as intonation, that I am unwilling to
>>>>>                 revert because I know for a fact that the old
>>>>>                 behavior caused loss of information. So my goal is
>>>>>                 very simple: I am working to create a very
>>>>>                 functional, intelligible DECtalk to put back out. I
>>>>>                 am unwilling to try and make it sound exactly like
>>>>>                 any given person wants it to. I have been through
>>>>>                 this before, and the ear is very sensitive. If
>>>>>                 you're directly comparing two versions side-by-side,
>>>>>                 you're not testing anything but whether they sound
>>>>>                 the same, and that is an exercise in futility.
>>>>>                 Any specific issues I can address. Secondly, a
>>>>>                 word of warning to listeners providing feedback:
>>>>>                 the other thing we've learned is that listeners
>>>>>                 are excellent at deciding that something is not
>>>>>                 right, but are absolutely terrible at exactly
>>>>>                 pinpointing the problem. The reason for this is
>>>>>                 quite simple: people judge the output as speech,
>>>>>                 which it only kinda is. By this I mean that a
>>>>>                 synthesizer can make mistakes that humans cannot
>>>>>                 possibly make and as a consequence can't possibly
>>>>>                 recognize. An example of this is that after so
>>>>>                 many years of working with it I have learned to
>>>>>                 hear a formant that's moving too rapidly, but most
>>>>>                 people cannot hear it. This is because, to make
>>>>>                 life easy, we learn to ignore stuff that's not
>>>>>                 important in our language, such as the nasal
>>>>>                 vowels in French or the retroflex r's in
>>>>>                 American English, which non-native speakers have a
>>>>>                 heckuva time hearing.
>>>>>
>>>>>                 _______________________________________________
>>>>>                 DECtalk mailing list
>>>>>                 DECtalk at bluegrasspals.com  <mailto:DECtalk at bluegrasspals.com>
>>>>>                 http://bluegrasspals.com/mailman/listinfo/dectalk
>>>>
>>>>
>>>>                 -- 
>>>>                 Sent via Thunderbird.
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>     Ed, good mighty lord, you're doing excellent, dude.
>>
>
>
>
>
>

-- 
Sent via Thunderbird.
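
Ed's remark above about the vocal tract being at the limit of 16-bit
integers can be illustrated with a small sketch. It runs one two-pole
resonator of the kind used in Klatt-style formant synthesizers, once in
floating point and once with 16-bit fixed-point coefficients and state,
and reports how far the two outputs drift apart. This is not DECtalk's
code: the Q2.14 format, the 10 kHz sample rate, the 500 Hz / 60 Hz
formant and all function names are assumptions chosen for illustration.

```python
import math

Q = 14                                  # assumed Q2.14: values in [-2, 2)
INT16_MIN, INT16_MAX = -32768, 32767


def saturate(v):
    """Clamp an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, v))


def resonator_coeffs(freq_hz, bw_hz, sample_rate):
    """Two-pole resonator coefficients with unity gain at DC."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c
    return a, b, c


def run_float(samples, a, b, c):
    """y[n] = a*x[n] + b*y[n-1] + c*y[n-2] in floating point."""
    y1 = y2 = 0.0
    out = []
    for x in samples:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def run_fixed(samples, a, b, c):
    """Same recurrence with 16-bit coefficients and 16-bit state."""
    qa, qb, qc = (saturate(round(k * (1 << Q))) for k in (a, b, c))
    y1 = y2 = 0
    out = []
    for x in samples:
        acc = qa * x + qb * y1 + qc * y2    # wide accumulator
        y = saturate(acc >> Q)              # drop fraction (floors), then clamp
        out.append(y)
        y2, y1 = y1, y
    return out


# Hypothetical test signal: a single impulse followed by silence.
impulse = [10000] + [0] * 199
a, b, c = resonator_coeffs(500.0, 60.0, 10000.0)
float_out = run_float(impulse, a, b, c)
fixed_out = run_fixed(impulse, a, b, c)
max_err = max(abs(f - i) for f, i in zip(float_out, fixed_out))
print(f"largest difference between the two runs: {max_err:.2f}")
```

A real synthesizer cascades several such sections, so per-section
rounding and headroom limits add up, which may be part of why moving
the vocal tract to floating point is described as a fairly large effort.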