[DECtalk] Some DECtalk history and what I think we can and can't reasonably do
Alex H.
linuxx64.bashsh at gmail.com
Thu Aug 4 18:28:16 EDT 2011
My idea is to just do a last bit of tweakage on the voices as far as
formant parameters go, then do an initial release. It need not sound like
a particular version, just loads better than that HLsyn stuff from
before. The current version sounds like maybe a 4.5 or 4.6x version, and
that is totally fine to me. It's DECtalk and it's sounding unique, crisp
and clear. As Raymond pointed out, there's a bit of words running
together, such as "test of", but other than that, things are looking up
for DECtalk. Any thoughts?
Alex
On 8/4/2011 1:31 PM, ebruckert Bruckert wrote:
> Here is my plan: we need to now enter a release cycle where Corine and
> I carefully tune the voices on the new synthesizer base and come out
> with the first release. I'm unwilling to try to make an exact match
> before we do a first release. There are many reasons for this, and the
> real issue is that this is the way to really start. After the initial
> release we can worry about other details, where we have to look for
> consensus on what people would like. Also, in many areas the rules are
> highly interactive, so a change may fix the exact problem you're trying
> to fix but have unintended side effects. Also there's issues like
> shutter priority be to provide a way to better control the synthesizer
> by getting around blocked commands by the screen reader application. I
> will update the file system and get started with Corine, hopefully
> tomorrow. Today I'm sick as a dog, so I don't want to do anything when
> I can barely think. And I am willing to continue for free to try and
> please the users as long as there is interest.
> For myself I can say I've listened to DECtalk so much, that I'm quite
> happy with the version we have right now.
> As a point of interest, what I think I have learned so far is that the
> single biggest issue was spectral tilt, when we incorporated the change
> made by Dennis, which from a speech standpoint is more correct, meaning
> more natural in spectral range. But from the overwhelming reaction
> we have anecdotal proof that this spectral shape is better for users.
> This is actually not terribly surprising, because on the other side of
> the coin we lack the higher formants, because for compute and other
> reasons it was impossible to add these to the synthesizer. At this
> point we could theoretically add them in, but it's a fairly large
> effort because we'd have to go from integer arithmetic to floating
> point for the vocal tract, as we're presently at the limit of what we
> can do with 16-bit integers.
>
> On Thu, Aug 4, 2011 at 7:46 AM, FRIDO ORDEMANN
> <enablerehab at verizon.net <mailto:enablerehab at verizon.net>> wrote:
>
> i can't tell the difference when listening as Ed suggests - excellent!
> thanks, Ed
>
> *From:* Alex H. <linuxx64.bashsh at gmail.com
> <mailto:linuxx64.bashsh at gmail.com>>
> *To:* dectalk at bluegrasspals.com <mailto:dectalk at bluegrasspals.com>
> *Sent:* Wed, August 3, 2011 4:34:48 PM
>
> *Subject:* Re: [DECtalk] Some DECtalk history and what I think we
> can and can't reasonably do
>
> Agreed. This new sample rules. It's pretty darn close to the
> original and has its own coolness..
>
> alex
>
> On 8/3/2011 4:09 PM, jake mcmahan wrote:
>> On 8/3/2011 3:42 PM, ebruckert Bruckert wrote:
>>> Okay, as an update: listen to the two wave files separately, not
>>> back-and-forth. Listen to one, wait a few minutes, then listen to
>>> the other. See if you agree we're getting closer. One, of course,
>>> is what you sent me.
>>>
>>> On Wed, Aug 3, 2011 at 1:43 PM, ebruckert Bruckert
>>> <edbruckert at gmail.com <mailto:edbruckert at gmail.com>> wrote:
>>>
>>> agreed
>>>
>>>
>>> On Wed, Aug 3, 2011 at 1:38 PM, Alex H.
>>> <linuxx64.bashsh at gmail.com
>>> <mailto:linuxx64.bashsh at gmail.com>> wrote:
>>>
>>> I, too, hope that HLsyn eventually will be a viable
>>> option and we could use the old method or HLsyn if we
>>> wanted, maybe for reading long texts and so on. It's a
>>> great idea and theory but just isn't mature enough at
>>> this point.
>>>
>>> Alex
>>>
>>>
>>>
>>>
>>> On 8/3/2011 1:13 PM, ebruckert Bruckert wrote:
>>>> There's always two sides to a coin: if DECtalk hadn't
>>>> been purchased it would have died. And since there was
>>>> no money from anyone to work on handicapped
>>>> applications, we had to do what our customers wanted
>>>> or go home. I recognize that the HLsyn work did not
>>>> yield the hoped-for results, and perhaps someday it can
>>>> with what we learned in our failures. But it was a
>>>> decision based on the best knowledge we had at the time,
>>>> and in fact also with Dennis Klatt's work. The problems
>>>> that occurred with the HLsyn version aren't of any
>>>> interest to me because the version put out was an early
>>>> one and it's not the right time to pursue trying to
>>>> perfect HLsyn.
>>>> So all I can do is my best.
>>>> As to the person that mentioned the idea of putting
>>>> meaning into the text: DECtalk actually has the ability
>>>> to do some marking and adjustment to try to achieve
>>>> that by hand. Automating the system to do that is well
>>>> beyond our knowledge and capability. Understanding what
>>>> is being conveyed is extremely difficult for
>>>> a computer. A simple example: "You did that." Depending
>>>> on which word you emphasize most, there are three
>>>> different ways of saying this very simple sentence, with
>>>> dramatically different meanings.
>>>> On Wed, Aug 3, 2011 at 12:07 PM, Alex H.
>>>> <linuxx64.bashsh at gmail.com
>>>> <mailto:linuxx64.bashsh at gmail.com>> wrote:
>>>>
>>>> Well, to us, we never really heard later versions
>>>> of DT, only the classics from the 90's, so forgive
>>>> us if we compare the new attempts to prior versions
>>>> - it's not like we have a huge library of source
>>>> code to just browse at will and endless samples of
>>>> every version.... so... yeah.
>>>>
>>>> Wanna know what's been wrong with the samples and
>>>> attempts posted to this list a few months ago for
>>>> the sapi dectalk? I'll tell you.
>>>>
>>>> The voices were clipping and squawking, and all the
>>>> voices sounded like they had a speech problem.
>>>> Perfect Paul wasn't perfect as most of us have
>>>> heard before. The voices themselves sound not like
>>>> DECTalk at all, they also drop out in volume, just
>>>> like a human cuz it's using HLsyn to make it sound
>>>> more natural.
>>>> I've heard DT 4.2cd, 4.3, 4.4, 4.61, 4.62 and 4.64.
>>>> But since you've pointed out before that version
>>>> numbers don't matter, so to speak, is this even
>>>> important anyway or are we just listening to the
>>>> same code with minor tweaks to get the various
>>>> versions we know?
>>>>
>>>> Disable HLsyn in the new product, and it'll suck
>>>> less. I like formant-based synths, not ones that
>>>> try and sound human, because I and others are used
>>>> to classic formant non-HLsyn versions of DECTalk.
>>>> True that HLsyn is still formant but it's trying to
>>>> sound real and have human articulation, and knowing
>>>> that I can understand why this version sounds
>>>> different. It's just not what we're used to, that's
>>>> all. Some Joe Blow off the street who has never
>>>> heard synthesized speech can't understand Eloquence
>>>> from DECTalk from Espeak anyways, so this point of
>>>> understanding speech is a moot one. They'd be
>>>> better off using Cepstral or some human-sampled
>>>> synths and wasting their hard drive space. This is
>>>> being targeted at a relatively small group of
>>>> people who have used DECTalk before and like it, so
>>>> I think we're safe there. I'd consider giving HLsyn
>>>> another shot if it was completed. But as always,
>>>> corporate America screws everyone over in the end,
>>>> and that was the case with DECtalk. So much so,
>>>> that Fonix wanted to make FonixTalk and specifically
>>>> try and make it sound human. The result sucks.
>>>>
>>>>
>>>> Alex
>>>> On 8/3/2011 11:17 AM, ebruckert Bruckert wrote:
>>>>> First of all, let me make you aware that I use
>>>>> DragonDictate, as I can't see very well and
>>>>> proofreading is quite painful, so you'll have to
>>>>> forgive and interpret around any mistakes that
>>>>> DragonDictate may make.
>>>>> I was taught about formant speech synthesis by
>>>>> Dennis Klatt, and by reading, but before my
>>>>> involvement with him I knew next to nothing. One
>>>>> of the questions in the early days was whether you
>>>>> could achieve higher intelligibility by super
>>>>> articulation and do better than natural speech.
>>>>> What testing revealed was really two things. At
>>>>> normal speaking rates the answer always seemed to be
>>>>> that the closer you matched real speech, the
>>>>> better the intelligibility. At higher speaking
>>>>> rates, above that which humans could normally
>>>>> achieve, things were a little different, and I'm not
>>>>> going to go into the specifics of what we did to
>>>>> make things better at high speed, other than to say
>>>>> they were based on knowledge of speech perception.
>>>>> The second thing we learned is that listening
>>>>> to a synthesizer has a very fast but steep
>>>>> learning curve. Somewhat analogous to learning to
>>>>> understand a person with a strong dialect or
>>>>> speech impediment. One of the problems we
>>>>> encountered is that people often preferred the
>>>>> version they were used to over any succeeding
>>>>> version. But actual tests did not support the
>>>>> preference.
>>>>> One example is the way tilt was done inside
>>>>> DECtalk. The original mechanism was a crude
>>>>> approximation of spectral tilt. Dennis, before he
>>>>> died, developed a much more accurate (meaning
>>>>> matching human production) tilt filter that was
>>>>> not able to be incorporated until a later date. As a
>>>>> point of interest, Dennis was so dedicated that he
>>>>> last modified the DECtalk code 3 days before he
>>>>> passed away. So the spectral tilt was changed, and
>>>>> this changed what you might consider the tone
>>>>> control on an old radio or record player. That is
>>>>> just one of many reasons why DECtalk changed
>>>>> slightly over the years.
>>>>> The 5.0 DECtalk incorporated the work of
>>>>> Prof. Ken Stevens, who was Dennis's colleague at MIT
>>>>> and close friend. The 5.0 code unfortunately did
>>>>> not yield the expected results, but we did learn a
>>>>> lot from the attempt.
>>>>> There are even some changes to DECtalk that
>>>>> would change the way it sounds from any particular
>>>>> version, such as intonation, that I am unwilling to
>>>>> revert because I know for a fact that they caused
>>>>> loss of information. So my goal is very simple: I
>>>>> am working to create a very functional,
>>>>> intelligible DECtalk to put back out. I am
>>>>> unwilling to try and make it sound exactly like
>>>>> any given person wants it to. I have been through
>>>>> this before, and the ear is very sensitive, and if
>>>>> you are directly comparing two versions side-by-side
>>>>> you're not testing anything but whether they sound the
>>>>> same, and that is an exercise in futility.
>>>>> Any specific issues I can address. Secondly, as a
>>>>> word of warning to listeners providing feedback:
>>>>> the other thing we've learned is that listeners
>>>>> are excellent at deciding that something is not
>>>>> right, but are absolutely terrible at exactly
>>>>> pinpointing the problem. The reason for this is
>>>>> quite simple: people judge the output as speech,
>>>>> which it only kinda is. By this I mean that a
>>>>> synthesizer can make mistakes that humans cannot
>>>>> possibly make and as a consequence can't possibly
>>>>> recognize. An example of this is that after so
>>>>> many years of working with it I have learned to
>>>>> hear a formant that's moving too rapidly, but most
>>>>> people cannot hear it. This is because to make
>>>>> life easy we learn to ignore stuff that's not
>>>>> important in our language, such as the nasal
>>>>> vowels in French or the retroflex r's in
>>>>> American English which is Sheehan have a heckuva
>>>>> time hearing.
>>>>>
>>>>> _______________________________________________
>>>>> DECtalk mailing list
>>>>> DECtalk at bluegrasspals.com <mailto:DECtalk at bluegrasspals.com>
>>>>> http://bluegrasspals.com/mailman/listinfo/dectalk
>>>>
>>>>
>>>> --
>>>> Sent via Thunderbird.
>>>>
>> Ed, good mighty lord, you're doing excellent, dude.
>>
--
Sent via Thunderbird.