[DECtalk] Some DECtalk history and what I think we can and can't reasonably do
jake mcmahan
mcmahan.jake at gmail.com
Wed Aug 3 16:09:58 EDT 2011
On 8/3/2011 3:42 PM, ebruckert Bruckert wrote:
> Okay, as an update: listen to the two wave files separately, not
> back-and-forth. Listen to one, wait a few minutes, then listen to the
> other. See if you agree we're getting closer. One, of course, is what
> you sent me.
>
> On Wed, Aug 3, 2011 at 1:43 PM, ebruckert Bruckert
> <edbruckert at gmail.com <mailto:edbruckert at gmail.com>> wrote:
>
> agreed
>
>
> On Wed, Aug 3, 2011 at 1:38 PM, Alex H. <linuxx64.bashsh at gmail.com
> <mailto:linuxx64.bashsh at gmail.com>> wrote:
>
> I, too, hope that HLsyn eventually will be a viable option and
> we could use the old method or HLsyn if we wanted, maybe for
> reading long texts and so on. It's a great idea and theory but
> just isn't mature enough at this point.
>
> Alex
>
>
>
>
> On 8/3/2011 1:13 PM, ebruckert Bruckert wrote:
>> There are always two sides to a coin. If DECtalk hadn't been
>> purchased it would have died. And since there was no money
>> from anyone to work on handicapped applications, we had to do
>> what our customers wanted or go home. I recognize that the
>> HLsyn work did not yield the hoped-for results, and perhaps
>> someday it can, with what we learned in our failures. But it
>> was a decision based on the best knowledge we had at the time,
>> and in fact also on Dennis Klatt's work. The problems that
>> occurred with the HLsyn version aren't of any interest to me,
>> because the version put out was an early one and it's not the
>> right time to pursue trying to perfect HLsyn. So all I can do
>> is my best.
>> As to the person who mentioned the idea of putting
>> meaning into the text: DECtalk actually has the ability to do
>> some marking and adjustment to try to achieve that by hand.
>> Automating the system to do that is well beyond our knowledge
>> and capability. Understanding what is being conveyed is
>> extremely, extremely difficult for a computer. A simple
>> example: "You did that." Depending on which word you emphasize
>> most, there are three different ways of saying this very
>> simple sentence, with dramatically different meanings.
>> On Wed, Aug 3, 2011 at 12:07 PM, Alex H.
>> <linuxx64.bashsh at gmail.com
>> <mailto:linuxx64.bashsh at gmail.com>> wrote:
>>
>> Well, to us, we never really heard later versions of DT,
>> only the classics from the 90's, so forgive us if we
>> compare the new attempts to prior versions - it's not
>> like we have a huge library of source code to just browse
>> at will and endless samples of every version.... so... yeah.
>>
>> Wanna know what's been wrong with the samples and
>> attempts posted to this list a few months ago for the
>> sapi dectalk? I'll tell you.
>>
>> The voices were clipping and squawking, and all the
>> voices sounded like they had a speech problem. Perfect
>> Paul wasn't perfect as most of us have heard before. The
>> voices themselves sound not like DECTalk at all, they
>> also drop out in volume, just like a human cuz it's using
>> HLsyn to make it sound more natural.
>> I've heard DT 4.2cd, 4.3, 4.4, 4.61, 4.62 and 4.64. But
>> since you've pointed out before that version numbers
>> don't matter, so to speak, is this even important anyway or
>> are we just listening to the same code with minor tweaks
>> to get the various versions we know?
>>
>> Disable HLsyn in the new product, and it'll suck less. I
>> like formant-based synths, not ones that try and sound
>> human, because I and others are used to classic formant
>> non-HLsyn versions of DECTalk. True that HLsyn is still
>> formant but it's trying to sound real and have human
>> articulation, and knowing that I can understand why this
>> version sounds different. It's just not what we're used
>> to, that's all. Some Joe Blow off the street who has
>> never heard synthesized speech can't understand Eloquence
>> from DECTalk from Espeak anyways, so this point of
>> understanding speech is a moot one. They'd be better off
>> using Cepstral or some human-sampled synths and wasting
>> their hard drive space. This is being targeted at a
>> relatively small group of people who have used DECTalk
>> before and like it, so I think we're safe there. I'd
>> consider giving HLsyn another shot if it was completed.
>> But as always, corporate America screws everyone over in
>> the end, and that was the case with Dectalk. So much so,
>> that Fonix wanted to make FonixTalk and specifically try
>> and make it sound human. The result sucks.
>>
>>
>> Alex
>> On 8/3/2011 11:17 AM, ebruckert Bruckert wrote:
>>> First of all, let me make you aware that I use
>>> DragonDictate, as I can't see very well and proofreading
>>> is quite painful, so you'll have to forgive and interpret
>>> any mistakes that DragonDictate may make.
>>> I was taught about formant speech synthesis by
>>> Dennis Klatt, and by reading, but before my involvement
>>> with him I knew next to nothing. One of the questions in
>>> the early days was whether you could achieve higher
>>> intelligibility by super-articulation and do better than
>>> natural speech. What testing revealed was really two
>>> things. At normal speaking rates, the answer always seemed
>>> to be that the closer you matched real speech, the
>>> better the intelligibility. At higher speaking rates,
>>> above that which humans could normally achieve, things
>>> were a little different, and I'm not going to go into the
>>> specifics of what we did to make things better at high
>>> speed other than to say they were based on knowledge of
>>> speech perception.
>>> The second thing we learned is that listening to a
>>> synthesizer has a very fast but steep learning curve,
>>> somewhat analogous to learning to understand a person
>>> with a strong dialect or speech impediment. One of the
>>> problems we encountered is that people often preferred
>>> the version they were used to over any succeeding
>>> version. But actual tests did not support the preference.
>>> One example is the way tilt was done inside
>>> DECtalk. The original mechanism was a crude
>>> approximation of spectral tilt. Dennis, before he died,
>>> developed a much more accurate (meaning matching human
>>> production) tilt filter that could not be
>>> incorporated until a later date. As a point of interest,
>>> Dennis was so dedicated that he last modified the
>>> DECtalk code 3 days before he passed away. So the
>>> spectral tilt was changed, and this changed what you
>>> might consider the tone control on an old radio or
>>> record player. That is just one of many reasons why
>>> DECtalk changed slightly over the years.
>>> The 5.0 DECtalk incorporated the work of Prof. Ken
>>> Stevens, who was Dennis's boss at MIT and close friend.
>>> The 5.0 code unfortunately did not yield the expected
>>> results, but we did learn a lot from the attempt.
>>> There are even some changes to DECtalk that would
>>> change the way it sounds from any particular version,
>>> such as intonation, that I am unwilling to revert because
>>> I know for a fact that the earlier behavior caused loss of
>>> information. So my goal is very simple: I am working to
>>> create a very functional, intelligible DECtalk to put back
>>> out. I am unwilling to try and make it sound exactly like
>>> any given person wants it to. I have been through this
>>> before, and the ear is very sensitive. If you are directly
>>> comparing two versions side by side, you are not testing
>>> anything but whether they sound the same, and that is an
>>> exercise in futility.
>>> Any specific issues I can address. Secondly, as a word of
>>> warning to listeners providing feedback: the other thing
>>> we've learned is that listeners are excellent at
>>> deciding that something is not right, but are absolutely
>>> terrible at pinpointing exactly what the problem is. The
>>> reason for this is quite simple: people judge the output as
>>> speech, which it only kinda is. By this I mean that a
>>> synthesizer can make mistakes that humans cannot
>>> possibly make, and that listeners as a consequence can't
>>> possibly recognize. An example of this is that after so many
>>> years of working with it I have learned to hear a
>>> formant that's moving too rapidly, but most people
>>> cannot hear it. This is because, to make life easy, we try
>>> to ignore stuff that's not important in our language,
>>> such as the nasal vowels in French or the retroflex
>>> r's in American English, which is Sheehan have a heckuva
>>> time hearing.
>>>
>>> _______________________________________________
>>> DECtalk mailing list
>>> DECtalk at bluegrasspals.com <mailto:DECtalk at bluegrasspals.com>
>>> http://bluegrasspals.com/mailman/listinfo/dectalk
>>
>>
>> --
>> Sent via Thunderbird.
>>
>
>
> --
> Sent via Thunderbird.
>
Ed, good mighty lord, you're doing excellent, dude.