From bkuhn at sfconservancy.org Mon Sep 26 17:33:20 2022
From: bkuhn at sfconservancy.org (Bradley M. Kuhn)
Date: Mon, 26 Sep 2022 10:33:20 -0700
Subject: Minutes for AI Assist Committee on Tuesday 2022-07-18, 18:00-19:00 UTC
Message-ID: <87a66mca73.fsf@ebb.org>

We make minutes from the Committee meetings public on this list. We welcome discussion from the public, and some members of the Committee are watching this mailing list and will bring useful points raised in public discussion back to the Committee.

We apologize that, while the Committee did meet through the summer, production of minutes was slowed.

We ask that when you reply to the list, you keep the subject line descriptive. In a long thread titled "Re: Minutes of the AI Assist Committee", it is difficult to follow the different threads of response that come from the Committee's minutes. Thus, while using threading (via In-Reply-To:, References: and other RFC-5322-compliant headers) is still useful, changing Subject: to match the content of your reply is much appreciated, since it helps delineate the conversation. Thanks!

BEGIN MINUTES, AI Assist Committee on Tuesday 2022-07-18, 18:00-19:00 UTC

Summary of discussion in public list during June
================================================

A rather stereotypical discussion about the freedom to run took place, along with discussion of field-of-use restrictions. There was some concern that treating the Open Source Definition (OSD) or the Free Software Definition (FSD) as sacrosanct might limit our thinking about what options might be available.

Specifically, the question is whether or not we should object, as a moral issue, to APAS's that help users write proprietary software. Is it a violation of the freedom to run to prohibit users from using your APAS to create more proprietary software? Is it a field-of-use restriction?
Indeed, some consider copyleft itself a field-of-use restriction, since it *does* restrict certain uses of the software, and the OSD effectively grandfathers copyleft in.

However, this discussion may not be helpful, and it reiterates discussion from the prior meeting and on the public mailing list. It is really no surprise that the conversation goes this way, since the debate is very old and is focused on what "freedom of use" means: copyleft has always been a question of global and universal freedom of use vs. the one-time freedom of use by an individual user. This may not really touch on how this debate plays out for AI tools. There *are* requirements on all FOSS licenses, even if the AI tools are copyrightable and the output is based on the copyrighted work.

Should Copyleft Licenses Address APAS Directly?
===============================================

But, should we recommend that copyleft licenses in future speak to the issue? Should they create the so-called "field of use" restriction that if the model is used to create. In other words, is copyright minimalism a central approach of copyleft? Would the OSI call it a "field of use" restriction if a copyleft license restricted creation of proprietary software through the AI system?

Copyleft really isn't a "field of use" restriction, since it's not like using it. There is no explicit statement that it should only apply to copyright. These licenses do talk about trademark and patent. However, we should focus on when it creates an identical copy.

Consider the additional permissions on Bison and GCC that allow use of the runtimes and/or program output in proprietary software. Was this merely practical ("would anyone adopt Bison or GCC if you could only make copylefted software with them?"), or is there a *moral* reason that GCC and Bison *must* have an exception that allows the creation of proprietary software?
This question really is akin to the questions asked during the drafting of the Affero clause, and may be relevant here. The Affero clause hooks on copyright, but it is an additional requirement not found in previous copylefts. Google did lobby OSI to determine the Affero clause to be a "field of endeavor" restriction. (Ultimately, OSI did approve Affero GPL.)

Suppose we wrote into a copyleft license that everything in APAS had to be free. Would that be wrong? This may be too strict. So suppose the APAS helps to index an API, and then both FOSS and proprietary re-implementations of that API can be made with the APAS. We'd want to allow that for everyone. Also, a license that addresses this probably wouldn't drive adoption.

The question is whether that license is merely clarifying what's already true about copyleft licenses or requiring something new. Consider the Affero clause, which is a new requirement, whereas the patent requirements in GPLv3 were merely clarifying what was already implicit.

Philosophically, it may well be better to focus only on the traditional copyright controls: is the output a copy or a derivative work? We are also not the only ones trying to understand.

Copyright Maximalism vs. Copyleft Maximalism?
=============================================

There was further discussion about whether any effort to raise concern about APAS's is ultimately a copyright maximalist point of view. We considered whether anyone really disagrees when the output is copyrightable and is a verbatim copy and/or a derivative work: does anyone (even the strongest copyright minimalist) really believe that APAS "washes away" the requirements (as GitHub's position has been)? There have been many music copyright cases that indicate that even unintentional derivative works by other artists are in fact copyright infringement. This seems analogous here.

Does copyright minimalism de facto mean copyleft minimalism?
The Affero clause is a good example here: it hooks on modification, not on the public performance right. Modification is easily covered, and the clause requires provision of source code to members of the public who "received performance" of the software; but the hook is not really public performance, it's modification. Similarly, if we hooked an expanded copyleft on modification, with a term that says "APAS's must follow a certain set of rules if you modify the software", does that make it a "field of endeavor" restriction? This is a cheeky proposal because it relies on the ambiguities in copyright.

Another approach might be to say that copyleft requirements are triggered in APAS reproduction *even if* copyright doesn't cover it. This would likely be a contract term, not a copyright term, but it would head toward copyleft maximalism even if copyright minimalism is maintained.

Should we try to cover these issues with license terms? We are confident we could do it now with contract terms, but does it take the licenses in a direction that's problematic? The question is not really whether this would *work* (it probably would), but whether it is morally acceptable. Was it morally required that the only penalties and hooks? Can you uphold software freedom with contracts? What if the contract said that you can't reimplement an API? Is that something we want? Do we find that morally acceptable? (Obviously we don't.) So, we may need to be somewhat contractually minimalist when looking at copyleft too. This may be worth exploring, but there are likely unintended consequences, so it's unlikely to be a short-term solution to the problem of APAS's.

There may be no bright line between a machine learning system that indexes code for searching and one that indexes it for generating similar code. Given the music cases, it's hard to imagine that a Court would not consider it infringement if the APAS output was substantially similar.
There are folks arguing that, in the UK, there are no restrictions whatsoever, but these claims seem dubious, and they may be assuming that AI systems only take data, not copyrighted works, as input.

Other Approaches To Calling APAS trained on FOSS Into Question
==============================================================

Should we approach this as an arms race, where we continue to show places where these APAS's produce infringing works, and the APAS creators keep modifying their systems (as GitHub already has been doing) to blacklist production of that particular output?

Ultimately, if we don't see these systems regularly producing infringing works, then our complaints become less valid. We do probably need to research directly to find out if these systems can produce infringing works. It would be valuable to show that these systems produce infringing works.

But what is the priority? Do we create the license-respecting APAS first, or do we focus resources on proving infringement by these systems? Can we rely on Amazon's system that claims to respect licenses? Is it worth doing that work? Is Copilot actually useful? It could be a flash in the pan, so putting effort into proving that Copilot infringes may not ultimately help.

END MINUTES, AI Assist Committee on Tuesday 2022-07-18, 18:00-19:00 UTC

--
Bradley M. Kuhn - he/him
Policy Fellow & Hacker-in-Residence at Software Freedom Conservancy
========================================================================
Become a Conservancy Sustainer today: https://sfconservancy.org/sustainer

From dirk at hohndel.org Mon Sep 26 18:54:52 2022
From: dirk at hohndel.org (Dirk Hohndel)
Date: Mon, 26 Sep 2022 11:54:52 -0700
Subject: new licenses aren't going to solve the problem
In-Reply-To: <87a66mca73.fsf@ebb.org>
References: <87a66mca73.fsf@ebb.org>
Message-ID: <31C036C8-6304-4793-B1AF-07CCDBBF790F@hohndel.org>

Bradley,

Thanks for those notes.
I'm trying to follow your instructions regarding responses...

> On Sep 26, 2022, at 10:33 AM, Bradley M. Kuhn wrote:
>
> Should Copyleft Licenses Address APAS Directly?
> ===============================================
>
> But, should we recommend that copyleft licenses in future speak to the
> issue? Should they create the so-called "field of use" restriction that if
> the model is used to create. In other words, is copyright minimalism a
> central approach of copyleft. Would the OSI call it a "field of use"
> restriction if a copyleft license restricted creation of proprietary
> software through the AI system?

Doesn't this approach lead to the outcome that 40+ years of free and open source software that uses existing licenses is already accepted as being open for ingestion into models? In other words, if we focus on creating NEW licenses that try to prevent the code in question from being reused in AI models, wouldn't that create the clear impression that we are in agreement that none of the existing licenses protect from that?

I had really hoped that at least for the reciprocal licenses good arguments could be made that this isn't as straightforward as some seem to think. And that also some of the protections that one could get from licenses (for example against patent claims) would be lost in such cases?

Or am I simply aiming my thinking in the wrong direction?

> Other Approaches To Calling APAS trained on FOSS Into Question
> ==============================================================
>
> Should we approach this as an arms race, where we continue to show places
> where these APAS's produce infringing works, and the APAS creators keep
> modifying their systems (as GitHub already has been doing) to blacklist
> production of that particular output.
>
> Ultimately, if we don't see these systems regularly producing infringing
> works, then our complaints become less valid.

Yes, that is similar to my concern - but kind of taking the opposite position.
By getting them to specifically challenge code and remove it, we can make it harder to object - just like I think that by creating new licenses we can make it harder to object.

/D

PS: I tried to massively cut this email down and may have inadvertently removed important passages - this was done to prevent flooding / drowning the recipients, not in an effort to hide points that were already made.