Should there be a clause for AI?
Vasileios Valatsos
me at aethrvmn.gr
Fri Jul 11 16:31:23 UTC 2025
On 11/7/25 18:02, Aaron Wolf wrote:
> That "Training Public License" seems to have a potential flaw by my
> first read. It says " you must release all resulting models, weights,
> and related code under the GPLv3 or later" but what does "release" mean?
> There's no Affero-type clause, so keeping the AI running on private
> servers even if giving others access might not count.
Very true; honestly, this is just my attempt at a "preventative"
interpretation of the GPLv3-or-later, where it works as a deterrent.
From that viewpoint, I just wanted a patch while the ongoing dialogue
progressed, so it is neither stated properly, nor is it a "good"
license. It is meant, however, to deter, and to that extent I think it
works okay as a patch.
On 7/11/25 7:31, Ben Cotton wrote:
> Would this be field of use discrimination? I think it would, and would
> therefore violate FSF Freedom 0 and OSD criteria 6. That would render
> copyleft-next neither Open Source nor Free in the capital-letter sense
> of the terms.
Honestly, I believe it is a matter of perspective. To me it is inclusive
rather than exclusive: "Copyleft holds for these use cases, and also for
this one." A clause like that could act as a clarification; in my
opinion it works much like the AGPL. One could argue that the AGPL is
also discriminatory, and the same for the LGPL, or for the GPL with the
linking exception, etc.
> At least with traditional software,
> there's typically some external evidence that a project was used in
> violation of its license. I'm not sure there's a good way to detect it
> in an LLM unless I make my training data public (which I would not if
> I were intending to violate the license).
Again, as I said before, I believe that copyleft works in two ways: as a
deterrent and as an enabler. This clarification clause would work as a
deterrent. Much like you would need to prove a GPL violation, you would
also need to prove this; however, I don't think this poses an
insurmountable issue. For one, a verbatim snippet of GPL-licensed code
can turn up in a model's output. There are also active legal cases in
the USA about LLMs being trained on copyrighted data, for example The
New York Times v. OpenAI, so there is tangible proof that there are ways
to figure out whether such models have used code licensed in a way that
would require them to be open (open as in open, not as in OSAID open).
- Vasileios Valatsos