[DECtalk] Some DECtalk history and what I think we can and can't reasonably do

johnnybrl at gmail.com johnnybrl at gmail.com
Fri Aug 5 01:14:52 EDT 2011


Wow! That almost sounds like the older versions of DECtalk! Does the inflection sound as good as the older DECtalks do? For example, if you were to have DECtalk say, "I want to go home. Now!" The older versions had a lot of inflection in them, which I really liked. Not many synths nowadays have as much expressive inflection as DECtalk has, even if the inflection of other synths is set to max.
  ----- Original Message ----- 
  From: ebruckert Bruckert 
  To: DECtalk Discussions 
  Sent: Wednesday, August 03, 2011 12:42 PM
  Subject: Re: [DECtalk] Some DECtalk history and what I think we can and can't reasonably do


  Okay, as an update, listen to the two wave files separately, not back-and-forth: listen to one, wait a few minutes, then listen to the other. See if you agree we're getting closer. One, of course, is what you sent me.


  On Wed, Aug 3, 2011 at 1:43 PM, ebruckert Bruckert <edbruckert at gmail.com> wrote:

    agreed 



    On Wed, Aug 3, 2011 at 1:38 PM, Alex H. <linuxx64.bashsh at gmail.com> wrote:

      I, too, hope that HLsyn eventually will be a viable option and we could use the old method or HLsyn if we wanted, maybe for reading long texts and so on. It's a great idea and theory but just isn't mature enough at this point.

      Alex 




      On 8/3/2011 1:13 PM, ebruckert Bruckert wrote: 
        There are always two sides to a coin: if DECtalk hadn't been purchased, it would have died. And since there was no money from anyone to work on handicapped applications, we had to do what our customers wanted or go home. I recognize that the HLsyn work did not yield the hoped-for results, and perhaps someday it can, with what we learned in our failures. But it was a decision based on the best knowledge we had at the time and, in fact, also on Dennis Klatt's work. The problems that occurred with the HLsyn version aren't of any interest to me, because the version put out was an early one and it's not the right time to pursue trying to perfect HLsyn.

        So all I can do is my best.
           As to the person that mentioned the idea of putting meaning into the text: DECtalk actually has the ability to do some marking and adjustment to try and achieve that by hand. Automating the system to do that is a great deal beyond our knowledge and capability. Understanding what is being conveyed is extremely difficult for a computer. A simple example: "You did that." Depending on which word you emphasize most, there are three different ways of saying this very simple sentence, with dramatically different meanings.


         On Wed, Aug 3, 2011 at 12:07 PM, Alex H. <linuxx64.bashsh at gmail.com> wrote:

          Well, to us, we never really heard later versions of DT, only the classics from the 90's, so forgive us if we compare the new attempts to prior versions. It's not like we have a huge library of source code to just browse at will and endless samples of every version... so... yeah.

          Wanna know what's been wrong with the samples and attempts posted to this list a few months ago for the sapi dectalk? I'll tell you.

          The voices were clipping and squawking, and all the voices sounded like they had a speech problem. Perfect Paul wasn't perfect, as most of us have heard before. The voices themselves don't sound like DECtalk at all; they also drop out in volume, just like a human, cuz it's using HLsyn to make it sound more natural.
          I've heard DT 4.2cd, 4.3, 4.4, 4.61, 4.62 and 4.64. But since you've pointed out before that version numbers don't matter, so to speak, is this even important anyway, or are we just listening to the same code with minor tweaks to get the various versions we know?

          Disable HLsyn in the new product, and it'll suck less. I like formant-based synths, not ones that try to sound human, because I and others are used to classic formant, non-HLsyn versions of DECtalk. True, HLsyn is still formant, but it's trying to sound real and have human articulation, and knowing that, I can understand why this version sounds different. It's just not what we're used to, that's all. Some Joe Blow off the street who has never heard synthesized speech can't tell Eloquence from DECtalk from eSpeak anyway, so this point about understanding speech is a moot one. They'd be better off using Cepstral or some human-sampled synths and wasting their hard drive space. This is being targeted at a relatively small group of people who have used DECtalk before and like it, so I think we're safe there. I'd consider giving HLsyn another shot if it was completed. But as always, corporate America screws everyone over in the end, and that was the case with DECtalk. So much so that Fonix wanted to make FonixTalk and specifically try to make it sound human. The result sucks.


          Alex
          On 8/3/2011 11:17 AM, ebruckert Bruckert wrote: 
               First of all, let me make you aware that I use DragonDictate, as I can't see very well and proofreading is quite painful, so you'll have to forgive and interpret any mistakes that DragonDictate may make.
               I was taught about formant speech synthesis by Dennis Klatt, and by reading, but before my involvement with him I knew next to nothing. One of the questions in the early days was whether you could achieve higher intelligibility by super-articulation and do better than natural speech. What testing revealed was really two things. At normal speaking rates, the answer always seemed to be that the closer you matched real speech, the better the intelligibility. At higher speaking rates, above what humans could normally achieve, things were a little different, and I'm not going to go into the specifics of what we did to make things better at high speed, other than to say they were based on knowledge of speech perception.
                 The second thing we learned is that listening to a synthesizer has a very fast but steep learning curve, somewhat analogous to learning to understand a person with a strong dialect or speech impediment. One of the problems we encountered is that people often preferred the version they were used to over any succeeding version, but actual tests did not support the preference.
                 One example is the way tilt was done inside DECtalk. The original mechanism was a crude approximation of spectral tilt. Before he died, Dennis developed a much more accurate (meaning matching human production) tilt filter that could not be incorporated until a later date. As a point of interest, Dennis was so dedicated that he last modified the DECtalk code 3 days before he passed away. So the spectral tilt was changed, and this changed what you might consider the tone control on an old radio or record player. That is just one of many reasons why DECtalk changed slightly over the years.
                  The 5.0 DECtalk incorporated the work of Prof. Ken Stevens, who was Dennis's boss at MIT and close friend. The 5.0 code unfortunately did not yield the expected results, but we did learn a lot from the attempt.
                   There are even some changes to DECtalk that would change the way it sounds from any particular version, such as intonation, that I am unwilling to revert, because I know for a fact that the earlier behavior caused loss of information. So my goal is very simple: I am working to create a very functional, intelligible DECtalk to put back out. I am unwilling to try to make it sound exactly like any given person wants it to. I have been through this before, and the ear is very sensitive. If you are directly comparing two versions side by side, you are not testing anything but whether they sound the same, and that is an exercise in futility.

            Any specific issues I can address. Secondly, a word of warning to listeners providing feedback: the other thing we've learned is that listeners are excellent at deciding that something is not right, but are absolutely terrible at pinpointing exactly what the problem is. The reason for this is quite simple: people judge the output as speech, which it only kinda is. By this I mean that a synthesizer can make mistakes that humans cannot possibly make and, as a consequence, cannot possibly recognize. An example of this is that after so many years of working with it I have learned to hear a formant that's moving too rapidly, but most people cannot hear it. This is because, to make life easy, we learn to ignore stuff that's not important in our language, such as the nasalized vowels in French or the retroflex r's in American English, which Asians have a heckuva time hearing.
  
_______________________________________________
DECtalk mailing list
DECtalk at bluegrasspals.com
http://bluegrasspals.com/mailman/listinfo/dectalk



          -- 
          Sent via Thunderbird.
