[DECtalk] DECtalk TTS licensing

Mon Aug 30 17:55:46 EDT 2021

Don,

> No.  You can keep "your code" separate and unencumbered by supporting
> an interface that allows your code to be replaced by something else.

No, sorry, but you are confusing the LGPL with the GPL.
Code under the GPL cannot be dynamically linked with proprietary code when both are part of a single product.
Besides the license text itself, the GPL FAQ explains this separately.

> There are several versions of Klatt's synthesizer out there.
> Most don't have the LTS algorithm coded completely.  Buy a
> copy of "MITalk:  From Text to Speech" as it will give you a good
> grounding in what goes on inside a synthesizer (DECtalk, specifically)

I am not ready to develop speech synthesis technology on my own; I want to use a ready-made solution.
When I talk about the public domain Klatt synthesizer, I mean this version: https://github.com/mattiasgustavsson/libs
(see the speech.h file in that repository).
Unfortunately, this version is noticeably worse than even eSpeak.

> Flite is considerably smaller than Festival.  Festival is intended for
> research into synthesis as well as imparting knowledge of content into
> the synthesizer.  I.e., it is more likely to get the pronunciation
> of a particular string of characters in a sentence correct than
> DECtalk would (because it looks at more context).

Sorry, but I am not sure you are familiar with how speech synthesizers are used in screen readers.
A screen reader often sends an incomplete sentence to the synthesizer.
For example, NVDA sends strings of about 100 characters to the synthesizer, and such a string may contain the end of one sentence and the beginning of the next.
Therefore, a synthesizer's ability to use sentence context is largely irrelevant in screen readers, and it is often even harmful.
Not to mention that users can deliberately read individual words or fragments of lines.
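
A minimal sketch of the effect described above. The function chunk_text is a hypothetical stand-in, not NVDA's actual segmentation logic; it only assumes that text reaches the synthesizer in fixed-size pieces (about 100 characters by default, shortened in the example so the effect is easy to see).

```python
def chunk_text(text, size=100):
    """Split text into consecutive pieces of at most `size` characters.

    Hypothetical helper: it cuts purely by length and ignores
    sentence boundaries, which is the situation described above.
    """
    return [text[i:i + size] for i in range(0, len(text), size)]


if __name__ == "__main__":
    sample = "First sentence ends here. Second sentence starts here."
    # A small size is used only to keep the demonstration short.
    for piece in chunk_text(sample, size=20):
        print(repr(piece))
```

With a cut like this, a single piece can hold the tail of one sentence and the head of the next, so the synthesizer never sees either sentence as a whole.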

> Why does it have to be compact?  If you are hosting it on a PC,
> you've got gobs of resources!

It will also be used on low-cost smartphones with little storage.
In addition, users may want to carry it on a USB flash drive as part of a portable copy of NVDA.
Users who need large, high-quality voices will be able to install them separately and connect them through common interfaces.
What I am interested in is compact robotic synthesizers to serve as the basic built-in voices.

-----Original Message-----
From: Dectalk <dectalk-bounces at bluegrasspals.com> On Behalf Of Don
Sent: Monday, August 30, 2021 11:49 PM
To: dectalk at bluegrasspals.com
Subject: Re: [DECtalk] DECtalk TTS licensing

On 8/30/2021 10:39 AM, Nikita wrote:
> Unfortunately, eSpeak is distributed under the viral copyleft GPL V3
> license.
>
> This means that it cannot be used in conjunction with proprietary code, but
> there are proprietary components in my project.

No.  You can keep "your code" separate and unencumbered by supporting
an interface that allows your code to be replaced by something else.
I.e., let that interface be whatever API espeak publishes; treat your
code like a "bolt-on" to espeak.

> eSpeak can be used as an external component, but then there is less control
> over it. In particular, it will not be possible to include it in the
> package. This is only acceptable for NVDA, where there is already a built-in
> eSpeak.

Why can't you modify eSpeak -- making those modifications public
(and altering the interface described above)?

> A similar problem with the open source RHVoice synthesizer.
>
> In addition to DECtalk, I also researched the possibility of licensing
> SoftVoice and Eloquence synthesizers, but this also has not worked out yet.
> SoftVoice copyright holders are also unresponsive, and finding an Eloquence
> licensor has proven difficult.
>
> There is a PICO TTS free synthesizer, but its speech at high speed is not as
> intelligible as I would like.

Ah, there's the rub!  How do you put a number on "intelligibility"?
I've written 5 synthesizers with the goal of trying to work around
some aspect of "(un)intelligibility" in each.  I can ask 500 people
which they think is "best" -- and get 100 people picking each!
(i.e., what one person finds appealing, others want to avoid)

> There is also a public domain Klatt synthesizer, but it is of too low
> quality.

There are several versions of Klatt's synthesizer out there.
Most don't have the LTS algorithm coded completely.  Buy a
copy of "MITalk:  From Text to Speech" as it will give you a good
grounding in what goes on inside a synthesizer (DECtalk, specifically)

> MBROLA and Festival/Flite synthesizers are too large and clumsy, and also
> not intelligible enough at high speech speeds.

Again, "intelligible".

Flite is considerably smaller than Festival.  Festival is intended for
research into synthesis as well as imparting knowledge of content into
the synthesizer.  I.e., it is more likely to get the pronunciation
of a particular string of characters in a sentence correct than
DECtalk would (because it looks at more context).

> If you can recommend any other compact English synthesizer that I have not
> named, then I would be grateful.

Why does it have to be compact?  If you are hosting it on a PC,
you've got gobs of resources!

I run one of my synthesizers *in* a bluetooth earpiece.  No gobs of
memory.  No disk drive.  Extremely limited power consumption
(so no big fancy processor).

But, I don't expect it to correctly pronounce every possible sentence
that can be constructed!

"The pt presented at 0234 ZULU with an initial Dx of Coccidioidomycosis."

> My synthesizer will have several built-in robotic voices for different
> languages with automatic voice switching, as well as support for
> connecting external voices for other languages or for high speech quality
> from the system (MS SAPI5, Android, etc.). The synthesizer will be supplied
> for NVDA, SAPI5 and Android Speech TTS (in the future, it is possible for
> macOS and Linux in a version compatible with the corresponding API).
>
> I am currently looking for a compact robotic voice for English to embed it
> as the default voice for Latin script.
>
> The synthesizer I am developing is geared towards blind professionals
> working with different languages. Voices will automatically switch based on
> script (alphabet) and language markup, for example, on web pages.
>
> My project is non-commercial, but I am ready to buy an embedded license for
> my own money, if the price is reasonable for my personal budget. However, so
> far I cannot even establish contact with the copyright holders of the
> necessary technologies in order to find out the licensing conditions.
>
> In addition, the situation is complicated by the fact that I am not from the
> USA. I can draw up contracts only through a non-profit organization in
> another country.
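
The script-based voice switching mentioned in the quoted message could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: script_of and split_by_script are hypothetical names, and the script is guessed from the first word of each character's Unicode name, which is a coarse heuristic.

```python
import unicodedata


def script_of(ch):
    """Return a coarse script label for a letter, or None for neutral characters."""
    if not ch.isalpha():
        return None
    # Assumption: the first word of the Unicode character name names the
    # script, e.g. "LATIN SMALL LETTER A" or "CYRILLIC SMALL LETTER EM".
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def split_by_script(text):
    """Split text into (script, run) pairs; a voice could be chosen per run.

    Spaces, digits and punctuation carry no script of their own, so they
    stay attached to the run that is currently open.
    """
    runs = []
    current_script = None
    current = ""
    for ch in text:
        script = script_of(ch)
        if script is not None and current_script is not None and script != current_script:
            runs.append((current_script, current))
            current = ""
        if script is not None:
            current_script = script
        current += ch
    if current:
        runs.append((current_script, current))
    return runs


if __name__ == "__main__":
    for script, run in split_by_script("Hello мир"):
        print(script, repr(run))
```

Language markup, where a page provides it, would take priority over a guess like this; the sketch covers only the script (alphabet) part.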

_______________________________________________
Dectalk mailing list
Dectalk at bluegrasspals.com
https://bluegrasspals.com/mailman/listinfo/dectalk