Minutes for AI Assist Committee on Tuesday 2022-07-18, 18:00-19:00 UTC

Bradley M. Kuhn bkuhn at sfconservancy.org
Mon Sep 26 17:33:20 UTC 2022


We make minutes from the Committee meetings public on this list.  We welcome
discussion from the public, and some members of the Committee are watching
this mailing list and will bring useful points raised in public discussion
back to the committee.  We apologize that while the Committee did meet
through the summer, production of minutes was slowed.

We ask that when you reply to the list, be mindful to keep the subject line
descriptive.  In a long thread where every message has the subject “Re:
Minutes of the AI Assist Committee”, it is going to be difficult to follow
the different threads of response that come from the Committee's minutes.
Thus, while using threading (via
In-Reply-To:, References: and other RFC-5322-compliant headers) is still
useful, making sure that you change Subject: to match the content is much
appreciated for delineation of conversation.  Thanks!

BEGIN MINUTES, AI Assist Committee on Tuesday 2022-07-18, 18:00-19:00 UTC

Summary of discussion in public list during June
================================================

There was a rather stereotypical discussion about the freedom to run, and
there was discussion of field-of-use restrictions.  There was some concern
that treating the Open Source Definition (OSD) or the Free Software
Definition (FSD) as sacrosanct might limit our thinking about what options
might be available.

Specifically, the question was whether or not we should object, as a moral
issue, to APAS's that help users write proprietary software.  Is it a
violation of the freedom to run to prohibit users from using your APAS to
create more proprietary software?  Is it a field-of-use restriction?
Indeed, some consider copyleft itself a field-of-use restriction since it
*does* restrict certain uses of the software, and the OSD effectively
grandfathers copyleft in.

However, this discussion may not be helpful, and reiterates previous
discussion from the prior meeting and on the public mailing list. 


There is really no surprise that the conversation goes in this way, since
the debate is very old, and the debate is focused around what is “freedom of
use” — copyleft has always been a question of global and universal freedom
of use vs. the one-time freedom of use by an individual user.

This old debate may not really touch on how this plays out for AI tools.
There *are* requirements in all FOSS licenses, even if the AI tools are
copyrightable and the output is based on the copyrighted work.


Should Copyleft Licenses Address APAS Directly?
===============================================

But, should we recommend that copyleft licenses in future speak to the
issue?  Should they create the so-called “field of use” restriction that
applies if the model is used to create proprietary software?  In other
words, is copyright minimalism a central approach of copyleft?  Would the
OSI call it a “field of use” restriction if a copyleft license restricted
creation of proprietary software through the AI system?

Copyleft really isn't a “field of use” restriction, since it's not like
using it.  There is no explicit statement that it should only apply to
copyright.  These licenses do talk about trademark and patent.  However, we
should focus on when it creates an identical copy.

What about the additional permissions on Bison and GCC that allow use of
the runtimes and/or program output in proprietary software?  Was this
merely practical: “would anyone adopt Bison or GCC if you could only make
copylefted software with them?”  Or, “is there a *moral* reason that GCC and
Bison *must* have an exception that allows the creation of proprietary
software?”  This question really is akin to the questions asked during the
drafting of the Affero clause, and may be relevant here.  The Affero clause
hooks on copyright but it is an additional requirement not in previous
copylefts.  Google did lobby OSI to determine the Affero clause to be a
“field of endeavor” restriction.  (Ultimately, OSI did approve Affero GPL.)

Suppose we wrote into a copyleft license that everything in APAS had to be
free.  Would that be wrong?

This may be too strict.  Suppose the APAS helps to index an API, and then
both FOSS and proprietary re-implementations of that API can be made with
the APAS.  We'd want to allow that for everyone.

Also, a license that addresses this probably wouldn't drive adoption.

The question is whether that license is merely clarifying what's already
true about copyleft licenses or is requiring something new.  Consider the
Affero clause, which is a new requirement, whereas the patent requirements
in GPLv3 were merely clarifying what was already implicit.

Philosophically, it may well be better to focus only on the traditional
copyright controls: is the output a copy or a derivative work?

We are also not the only ones trying to understand these issues.

Copyright Maximalism vs. Copyleft Maximalism?
=============================================

There was further discussion about the question of whether any effort to
raise concern about APAS's ultimately is a copyright maximalist point of
view.

We considered the question of whether anyone really disagrees in the case
where output is copyrightable and is a verbatim copy and/or a derivative
work: does anyone (even the strongest copyright minimalist) really believe
that APAS “washes away” the requirements (as GitHub's position has been)?

There have been many music copyright cases that indicate that even
unintentional derivative works by other artists are in fact copyright
infringement.  This seems analogous here.

Does copyright minimalism de facto mean copyleft minimalism?  The Affero
clause is a good example here: it hooks on modify, not the public
performance right.  Modification is easily covered, and the clause requires
provision of source code to the public who “receive performance” of the
software, but the hook is not really public performance; it's modification.

Similarly, if we hooked an expanded copyleft on modify, with a term that
says “APAS's must follow a certain set of rules if you modify the software”
… does that make it a “field of endeavor” restriction?

This is a cheeky proposal because it relies on the ambiguities in copyright.

Another approach might be to say that copyleft requirements are triggered in
APAS reproduction *even if* copyright doesn't cover it.  This would likely
be a contract term, not a copyright term, but it would head toward copyleft
maximalism even if copyright minimalism is maintained.

Should we try to cover these issues with license terms?  We are confident we
could do it now with contract terms, but does it take the licenses in a
direction that's problematic?

The question is not really whether this would *work* (it probably would),
but is it morally acceptable?  Was it morally required that the only
penalties and hooks come from copyright?  Can you uphold software freedom
with contracts?

What if the contract said that you can't reimplement an API?  Is that
something we want?  Do we find that morally acceptable?  (Obviously we
don't.)  So, we may need to be somewhat contractually minimalist when
looking at copyleft too.

This may be worth exploring but there are likely unintended consequences, so
it's unlikely to be a short-term solution to the problem of APAS's.

There may be no bright line between a machine learning system for indexing
code for searching and indexing it for generating similar code.

It's hard to imagine, given the music cases, that a Court would not consider
it infringement if the APAS output was substantially similar.

There are folks who are arguing that, in the UK, there are no restrictions
whatsoever, but these claims seem dubious, and those folks may be
considering that AI systems only take data, not copyrighted works, as input.

Other Approaches To Calling APAS Trained on FOSS Into Question
==============================================================

Should we approach this as an arms race, where we continue to show places
where these APAS's produce infringing works, and the APAS creators keep
modifying their systems (as GitHub already has been doing) to blacklist
production of that particular output?

Ultimately, if we don't see these systems regularly producing infringing
works, then our complaints become less valid.

We do probably need to research directly to find out if these systems can
produce infringing works.  It would be valuable to show that these systems
produce infringing works.

What is the priority, though?  Do we create the license-respecting APAS
first, or do we focus resources in proving infringement by these systems?

Can we rely on Amazon's system that claims to respect licenses?  Is it worth
doing that work?

Is Copilot actually useful?  It could be a flash in the pan, so effort that
we put into proving that Copilot infringes may not really be of help.

END MINUTES, AI Assist Committee on Tuesday 2022-07-18, 18:00-19:00 UTC


-- 
Bradley M. Kuhn - he/him
Policy Fellow & Hacker-in-Residence at Software Freedom Conservancy
========================================================================
Become a Conservancy Sustainer today: https://sfconservancy.org/sustainer

