[DECtalk] DECtalk TTS licensing

Devin Prater r.d.t.prater at gmail.com
Mon Aug 30 09:40:50 EDT 2021


Man, this should be spread far and wide throughout TTS circles. Screen
readers should do this too, but most have barely gotten past simply reading
out whatever the API exposes. We do now have OCR and image recognition in
some screen readers, but TTS is still a sort of binary thing: "is speaking"
or "is not speaking", at a user-defined rate, pitch, volume, and sometimes
intonation. I'd love to see more screen readers become more like Emacspeak.
Devin Prater
r.d.t.prater at gmail.com
gemini://tilde.pink/~devinprater/



On Mon, Aug 30, 2021 at 8:34 AM Don <Text_to_Speech at gmx.com> wrote:

> On 8/30/2021 6:06 AM, Devin Prater wrote:
> > I mean, there is ESpeak.
>
> There are *lots* of (FOSS/exposed) synthesizers out there,
> if you are looking to incorporate one into a product (and
> likely need to work with sources)!
>
> It's an ancient technology that has been replicated by many
> people in many different ways over the past 4 decades (longer
> if you want to look at cruder synthesis technologies).  You
> just have to decide what features you want *in* the synthesizer
> and what resources you have to devote to it.
>
> ["Making noises" that sound like bits of speech is relatively easy]
>
> DECtalk made sense in the 1980s -- when resources were scarce
> and the synthesizer had to be a "bag" that was bolted onto
> an existing product/system.  It was a "one-size-fits-all",
> standalone solution to "converting text into speech".  But,
> *it* had no idea what purpose it was serving in any particular
> application.  So, it could never adjust its approach to
> synthesis to match the expectations made of it.
>
> Nowadays, one would *integrate* the synthesizer into the
> product/system to improve performance, intelligibility, etc.
>
> Do you really think ONE set of synthesis rules should apply to:
> - reading your email
> - reading a URL
> - reading a password
> - reading a web page
> - reading a novel
> - reading a child's book
> - reading stock tickers/quotes
> - reading picture captions
>
> A synthesizer should understand context; HOW it is being used
> AT THIS PARTICULAR TIME.  If you push that responsibility into
> the application that is driving the synthesizer, then much
> of the value of the "TTS as black box" disappears -- you're
> doing the work *for* it!
>
> Why not just implement a "LETTER-to-speech" synthesizer and
> *spell* everything out loud?  (Ans:  because, while it could
> TRULY be "one-size-fits-all" -- because it pushes all of the
> real work into the listener's brain -- it would be incredibly
> unhelpful.)
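
To make the point about context concrete, here is a minimal sketch of a
front end that picks different text-preparation rules depending on what is
being read. Everything in it is an assumption made for illustration: the
names Context, SYMBOL_NAMES and prepare_for_speech are hypothetical, and
none of this is DECtalk's, ESpeak's, or any screen reader's actual API. It
only produces the string that would be handed to a synthesizer.

```python
import re
from enum import Enum


class Context(Enum):
    """Hypothetical reading contexts, taken from the list quoted above."""
    PROSE = "prose"        # email, novel, web page text
    URL = "url"
    PASSWORD = "password"
    TICKER = "ticker"


# Assumed spoken names for a few symbols; a real table would be much larger.
SYMBOL_NAMES = {
    ".": "dot", "/": "slash", ":": "colon", "-": "dash",
    "_": "underscore", "@": "at", "!": "exclamation mark",
}


def _spell_char(ch: str) -> str:
    """Name one character so that case and symbols are unambiguous."""
    if ch.isupper():
        return f"capital {ch}"
    if ch.isalnum():
        return ch
    return SYMBOL_NAMES.get(ch, f"symbol {ch}")


def prepare_for_speech(text: str, context: Context) -> str:
    """Return the text a synthesizer should speak for the given context."""
    if context is Context.PASSWORD:
        # Every character matters, so spell all of them out.
        return ", ".join(_spell_char(ch) for ch in text)
    if context is Context.URL:
        # Keep alphanumeric runs as words, but name the punctuation.
        tokens = re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9\s]", text)
        return " ".join(SYMBOL_NAMES.get(tok, tok) for tok in tokens)
    if context is Context.TICKER:
        # Ticker symbols are read letter by letter, not as words.
        return " ".join(ch for ch in text.upper() if ch.isalnum())
    # Ordinary prose is passed through for the synthesizer's own rules.
    return text


if __name__ == "__main__":
    print(prepare_for_speech("aB3!", Context.PASSWORD))
    print(prepare_for_speech("https://example.com/a", Context.URL))
    print(prepare_for_speech("DEC", Context.TICKER))
```

The same input string comes out differently depending on the context it is
given, which is the kind of decision a standalone "black box" synthesizer
has no way to make on its own.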