Should there be a clause for AI?
Vasileios Valatsos
me at aethrvmn.gr
Fri Jul 11 16:31:23 UTC 2025
On 11/7/25 18:02, Aaron Wolf wrote:
> That "Training Public License" seems to have a potential flaw by my
> first read. It says " you must release all resulting models, weights,
> and related code under the GPLv3 or later" but what does "release" mean?
> There's no Affero-type clause, so keeping the AI running on private
> servers even if giving others access might not count.
Very true; honestly, this is just my attempt at a "preventative"
interpretation of the GPLv3-or-later, where it works as a deterrent.
From that viewpoint, I just wanted a patch while the ongoing dialogue
progressed, so it is neither stated properly, nor is it a "good"
license. It is meant, however, to deter, and to that extent I think it
works okay as a patch.
On 7/11/25 7:31, Ben Cotton wrote:
> Would this be field of use discrimination? I think it would, and would
> therefore violate FSF Freedom 0 and OSD criteria 6. That would render
> copyleft-next neither Open Source nor Free in the capital-letter sense
> of the terms.
Honestly, I believe it is a matter of perspective. To me it is inclusive
rather than exclusive: "Copyleft holds for these use cases, and also for
this one." A clause like that could act as a clarification; in my
opinion it works much like the AGPL. One could argue that the AGPL is
also discriminatory, and the same for the LGPL, or for the GPL with the
linking exception, etc.
> At least with traditional software,
> there's typically some external evidence that a project was used in
> violation of its license. I'm not sure there's a good way to detect it
> in an LLM unless I make my training data public (which I would not if
> I were intending to violate the license).
Again, as I said before, I believe that copyleft works in two ways: as a
deterrent and as an enabler. This clarification clause would work as a
deterrent. Much like you would need to prove a GPL violation, you would
also need to prove this; however, I don't think this poses an
insurmountable issue. For one, a verbatim snippet of GPL-licensed code
can turn up in a model's output. There are also active legal cases in
the USA about LLMs being trained on copyrighted data, for example The
New York Times v. OpenAI, so there is tangible proof that there are ways
to figure out whether such models have used code licensed in a way that
would require them to be open (open as in open, not as in OSAID open).
- Vasileios Valatsos