Simon/Licensing

Simon is licensed under the GPL, making it free and open source software.

However, like most larger software projects, we use many third-party products, and we pragmatically chose those that seemed to work best at the time. This is why one of our dependencies and the training modules use external software that is not licensed in a GPL-compatible way.

Disclaimer

I am not a lawyer. As such, everything written here is just my interpretation of various licenses and complicated legal situations.

I might be wrong.

Julius

Usage in Simon

Simon uses the large vocabulary continuous speech recognition (LVCSR) engine Julius for recognition.

The Problem

Julius, while also free and open source software, uses the original 4-clause BSD license, which according to GNU is a recognized free software license but is not compatible with the GPL.

Solution

Simon's license includes a special exception that allows linking to Julius.

Below is the affected part of the license. The added part (highlighted in the original) runs from "In addition, as a special exception" to "then also delete it here."

 3. You may copy and distribute the Program (or a work based on it,
  under Section 2) in object code or executable form under the terms of
  Sections 1 and 2 above provided that you also do one of the following:
   a) Accompany it with the complete corresponding machine-readable
   source code, which must be distributed under the terms of Sections
   1 and 2 above on a medium customarily used for software interchange; or,
   b) Accompany it with a written offer, valid for at least three
   years, to give any third party, for a charge no more than your
   cost of physically performing source distribution, a complete
   machine-readable copy of the corresponding source code, to be
   distributed under the terms of Sections 1 and 2 above on a medium
   customarily used for software interchange; or,
   c) Accompany it with the information you received as to the offer
   to distribute corresponding source code.  (This alternative is
   allowed only for noncommercial distribution and only if you
   received the program in object code or executable form with such
   an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for
making modifications to it.  For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable.  However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.
In addition, as a special exception, the copyright holders give
permission to link the code of portions of this program with the
Julius library under certain conditions as described in each
individual source file, and distribute linked combinations
including the two.
You must obey the GNU General Public License in all respects
for all of the code used other than Julius.  If you modify
file(s) with this exception, you may extend this exception to your
version of the file(s), but you are not obligated to do so.  If you
do not wish to do so, delete this exception statement from your
version.  If you delete this exception statement from all source
files in the program, then also delete it here.
If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.

To the best of my knowledge, the same issue has been handled in the same way with OpenSSL: [1].


HTK

Usage in Simon

For speech recognition to work, you need a speech model, which itself consists of two parts. The first describes which words, sentences, etc. exist and which sounds (phonemes) they are composed of (the "language model"). The second describes how these phonemes sound (the "acoustic model").

Simon builds and manages the language model on its own (with the help of the Julius LVCSR in some places).

If you want to create your own acoustic model instead of using a predefined one, or want to adapt a general model to your voice, Simon can use the HTK for that.

The Problem

The HTK license is not a free license.

You are allowed to browse and edit the source code, but you cannot redistribute the downloaded version of the HTK.

However, you are encouraged to file bug reports and re-distribute patches if you write them.

There have been free software projects based on the HTK, such as the free and open source HTS text-to-speech engine.

Solution

Simon never links to the HTK

The HTK is never linked directly into Simon; it is started as an external program.

This way, both the GPL and the HTK license can be fulfilled to the letter.
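The difference between linking a library and starting an external program can be illustrated with a minimal sketch. This is not Simon's actual code; the function name `run_external_tool` and the stand-in command are assumptions made purely for illustration. The point is only that the tool runs in its own process and communicates through its exit code and output streams.

```python
import subprocess
import sys


def run_external_tool(command, timeout=None):
    """Run a separate program and return (exit_code, stdout, stderr).

    The tool runs in its own process; nothing from it is linked or
    loaded into the calling program.
    """
    result = subprocess.run(
        command,
        capture_output=True,  # collect stdout and stderr instead of inheriting them
        text=True,            # decode the output as text
        timeout=timeout,
        check=False,          # report failures through the exit code
    )
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # Stand-in command so the sketch runs without the HTK installed.
    # With the HTK available, this list would instead name an HTK tool
    # followed by its arguments (hypothetical usage, not shown here).
    code, out, err = run_external_tool(
        [sys.executable, "-c", "print('training done')"]
    )
    print(code, out.strip())
```

If the external program is missing, `subprocess.run` raises `FileNotFoundError`, which a caller could use to report that the optional tool is not installed.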

Simon is functional without the HTK

Since Simon 0.3, many features of Simon no longer depend on the HTK at all.

Even without ever downloading the HTK, the user can set up a fully working speech recognition system, including custom use case scenarios, commands and many of the advanced features.

The only part of Simon that doesn't work without the HTK is the user-specific acoustic model training.

The following example usage of Simon doesn't need the HTK:

  • User downloads Simon
  • User downloads the Firefox, Amarok and window management scenarios to control Mozilla Firefox, Amarok and the window manager
  • User downloads the Voxforge GPL acoustic model for English and sets it to be a static base model

The user can then control his computer through voice commands.

The following example usage of Simon still doesn't need the HTK:

  • The user would like to change the command "Start Browser" to "Launch Internet Explorer":
    • Add the new words to the language model
    • Create a new grammar structure that allows Simon to recognize this sentence
    • Recompile the language model
    • Add a new command "Launch Internet Explorer" to start the browser

The user can then use "Launch Internet Explorer" to start the browser.

On the other hand, the following example usage of Simon does require the HTK:

  • The recognition performance is poor, so the user decides to train the acoustic model further through the integrated training procedure.

In short: even without the HTK, Simon still provides much more control over the speech model than comparable open source solutions like GnomeVoiceControl, which provides no way to change the acoustic model at all.

Speech models generated with the HTK can be free

Speech models created with the HTK do not inherit the HTK license.

The HTK license does not limit the redistribution of the models created with it.

The license covers these parts (direct quote):

All source code, object or executable code, associated technical documentation and any data files in this HTK distribution.


Models created with the software are not covered by this license. In fact, the official FAQ clearly says so (direct quote):

Can I build & sell products based on HTK3?

Yes. You can for example use HTK3 to train models that are then used in your products.

There are free acoustic models available

As models created by the HTK can be free and open source, there are of course free acoustic models already available.

In fact, this wiki contains a list of free and open source acoustic models that can be used with Simon.

The HTK file format is not proprietary

The models created by the HTK use a fairly simple ASCII file format that is very well documented.

You can, for example, use the free and open source SPHINX speech recognition toolkit to compile an acoustic model and then convert it to the HTK format.

There are multiple converters between SPHINX and HTK models available (a quick Google search turns up many more).
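To give an impression of how approachable the ASCII model format is, the following sketch lists the HMMs defined in a model file together with their number of states. It assumes the common layout in which each model starts with a `~h "name"` line followed by a `<NUMSTATES>` entry; the sample data is hand-written and heavily simplified, and the function name `list_hmms` is made up for this illustration. It is not a complete parser for the format.

```python
import re

# Matches the start of an HMM definition, e.g.:  ~h "ah"
HMM_HEADER = re.compile(r'^\s*~h\s+"([^"]+)"')
# Finds the state count, e.g.:  <NUMSTATES> 5  (keyword matched case-insensitively)
NUM_STATES = re.compile(r'<NUMSTATES>\s+(\d+)', re.IGNORECASE)


def list_hmms(lines):
    """Return {hmm_name: number_of_states} for the lines of an ASCII model file."""
    models = {}
    current = None
    for line in lines:
        header = HMM_HEADER.match(line)
        if header:
            current = header.group(1)
            models[current] = None  # state count not seen yet
            continue
        states = NUM_STATES.search(line)
        if states and current is not None:
            models[current] = int(states.group(1))
    return models


# Hand-written, simplified sample (assumption: not taken from a real model).
SAMPLE = '''~o <VecSize> 2 <USER>
~h "ah"
<BeginHMM>
<NumStates> 3
<State> 2
<Mean> 2
0.0 0.0
<Variance> 2
1.0 1.0
<TransP> 3
0.0 1.0 0.0
0.0 0.6 0.4
0.0 0.0 0.0
<EndHMM>
'''

if __name__ == "__main__":
    print(list_hmms(SAMPLE.splitlines()))
```

Because the format is plain text, a few regular expressions are enough to inspect a model without any HTK code being involved.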

The HTK is interchangeable

Simon is only very loosely bound to the HTK.

Technically the free and open source SPHINX speech recognition toolkit could replace both the HTK and Julius.

Simon has been designed so that both of these components are interchangeable.

The only reasons SPHINX support has not yet been implemented are that the documentation of SPHINX is, in my opinion, not as good as that of the HTK, and that the University of Technology Graz, which helps us with the speech recognition part of Simon, has never used CMU SPHINX before.