Comparing perspectives: Trainer/Model and Input

Tue Jul 5 19:53:03 UTC 2022

While I dislike delving into the legal details on this list, I know that it's
unavoidable to sometimes touch them.  All too often, the legal details of how
copyleft works sometimes unduly biases our thinking about what is
morally/ethically right or wrong.  We must strive to keep them separate.

Nonetheless, it's an important point here that Joseph raises:

Joseph Turner wrote:

> Another thing to consider is an additional licensing restriction to the
> input code which states something to the effect of "if you use my code to
> produce a Model, then anything produced by the Model must also carry my
> code's license." IIUC, a clause like this would clarify the legal
> definition of a derivative work.

> The Model's license is still a relevant concern here, but I think it's
> interesting to consider the issue from both the perspective of the rights
> and wishes of author of software used as input to the Model and also that
> of the rights and wishes of the author of the Trainer/Model.

While Microsoft's GitHub and those that support their behavior want us to
believe that it's somehow “settled law” that the license of the inputs to an
AI training model have no impact on the license model itself (i.e., that, in
their view, under no circumstances does the model constitute a
combined/derivative work of the model's inputs) — their rhetoric is not
backed by proof, case law, or much of anything but bluster.

AI training issues are a novel area of legal issues.  It may well be the case
that the model is indeed a derivative work of its inputs.

One of there reasons I point to the GCC RTL in this discussion (and I'm
surprised no one here has picked up on this) was to hint at the potential
opposing side of my position — perhaps some people *do* think that an
AI-assist-training Additional Permission is needed *specifically* so that
what GitHub does at the training moment is permissible.  IOW, some may feel
that freedom zero / “field of use” demands that copyleft be narrowed in this
area of AI for moral/ethical reasons.  I'm not arguing that for myself — as I
think as an ethical and moral position (regardless of what the license says),
that AI models ought to be under copyleft.

I only bring up the obvious opposing argument explicitly to note that the
moral/ethical question must come first, and only *then* a discussion of what
copyleft currently does, and *then* a decision of whether copyleft rules
yield the right argument.

Regardless of what your position is on that issue, the more important point
is that we should avoid letting what is true, or is not true, or what might
be true/not-true about copyright law and/or copyleft decide the moral and
ethical questions for us.  Copyleft is a tool and strategy; not a principle.
--
Bradley M. Kuhn - he/him
Policy Fellow & Hacker-in-Residence at Software Freedom Conservancy
========================================================================
Become a Conservancy Sustainer today: https://sfconservancy.org/sustainer