I was pointed to an interesting article from WP Tavern called Give Up GitHub: The Time Has Come! on the Software Freedom Conservancy website. The Software Freedom Conservancy is a nonprofit organization centered around ethical technology. Their mission is to ensure the right to repair, improve and reinstall software at your choosing. They promote and defend these rights through fostering free and open source software (also known as FOSS) projects, driving initiatives that actively make technology more inclusive for everyone, and advancing policy strategies that defend FOSS (such as copyleft) for all.
Those who’ve forgotten history often unintentionally repeat it. Some of us in the developer world may recall that twenty-one years ago, the most popular code hosting site, a fully Free and Open Source (FOSS) site called SourceForge, proprietarized all its code, never to make it FOSS again. All major FOSS projects slowly left SourceForge, since it had itself become a proprietary system, contrary to the philosophy of FOSS. The FOSS communities learned that it was a mistake to allow a for-profit, proprietary software company to become the dominant collaborative development site for FOSS. SourceForge slowly collapsed after the DotCom crash back in 2000, and to this day, SourceForge refuses to solve these problems. We learned a valuable lesson that was a bit too easy to forget, especially when corporate involvement manipulates FOSS communities for its own purposes. We must now learn the SourceForge lesson again, with Microsoft’s GitHub doing the same thing.
GitHub has, over the last ten years, risen to dominate the FOSS development ecosystem. They did this by building a user interface and adding social interaction features to the existing Git technology. (For its part, Git was designed specifically to make software development distributed without a centralized site.) In a central irony, GitHub succeeded where SourceForge had failed: they have convinced us to promote and even aid in the creation of a proprietary system that exploits FOSS. GitHub profits from its proprietary products (sometimes from customers who use them for problematic activities). Specifically, GitHub profits primarily from those who wish to use GitHub tools for in-house proprietary software development. Yet GitHub still comes out again and again seeming like a good actor, because they point to their generosity in providing services to so many FOSS projects. But we’ve learned from the many free offerings in Big Tech that if you aren’t the customer, you’re the product. The FOSS development methodology is GitHub’s product, which they’ve proprietarized and repackaged with our active, often unsuspecting, help.
FOSS developers have been for too long the proverbial frog in slowly boiling water. GitHub’s behaviour has gotten progressively worse over time, and we’ve excused, ignored, or otherwise consented to cognitive dissonance. The people at Software Freedom Conservancy have themselves been part of the problem: until recently, even they had become too comfortable, complacent, and complicit with GitHub. Giving up GitHub will require work and sacrifice, and may take a long time, even for them. The people at Software Freedom Conservancy historically self-hosted their primary Git repositories, but they did use GitHub as a mirror. They urged their member projects and community members to avoid GitHub (and all proprietary software development services and infrastructure altogether), but this was not enough. Today, they take a stronger stance. They are ending all their own uses of GitHub and announcing a long-term plan to assist FOSS projects in migrating away from GitHub, hoping to end its dominance. While they will not require their existing member projects to move at this time, they will no longer accept new member projects that do not have a long-term plan to migrate away from GitHub when given time to do so. They will also provide resources to support any of their member projects that choose to migrate, and help them however they can.
They give many good reasons to give up on GitHub, and they list the major ones on the Give Up GitHub site that they have created. They had already been considering this action themselves for some time, but last week’s events showed that it is long overdue.
Specifically, the people at Software Freedom Conservancy have been actively communicating with Microsoft and its GitHub subsidiary about their concerns with Copilot since GitHub first launched it almost exactly a year ago. Their initial video call, in July 2021, with Microsoft and GitHub representatives resulted in several questions which the representatives said they could not answer at that time, but would “answer soon”. After six months of no response, SFC’s Bradley M. Kuhn published his essay, If Software is My Copilot, Who Programmed My Software?, which raised these questions publicly. Still, GitHub did not answer. Three weeks later, SFC launched a committee of experts to consider the moral implications of AI-assisted software, along with a parallel public discussion. They invited Microsoft and GitHub representatives to the public discussion, but the invitation was ignored and the questions once again went unanswered. Last week, after SFC reminded GitHub of (a) the pending questions that they’d waited a year for GitHub to answer and (b) GitHub’s refusal to join public discussion on the topic, GitHub responded a week later, saying they would not join any public or private discussion on this matter because “a broader conversation [about the ethics of AI-assisted software] seemed unlikely to alter your [SFC’s] stance, which is why we [GitHub] have not responded to your [SFC’s] detailed questions”. In other words, GitHub’s final position on Copilot is that if you disagree with GitHub about policy matters related to Copilot, then you don’t deserve a reply from Microsoft or GitHub at all. They will only bother to reply if they think they can immediately change your policy position to theirs. But Microsoft and GitHub will leave you hanging for a year before they’ll tell you that!
Nevertheless, they were previously content to leave all this low on the priority list. After all, for its first year of existence, Copilot appeared to be more research prototype than actual product. All this changed last week, when GitHub announced Copilot as a commercial, for-profit product. Launching a for-profit product that disrespects the FOSS community in the way Copilot does simply makes the weight of GitHub’s bad behaviour too much for developers to bear.
Their three primary questions for Microsoft/GitHub regarding Copilot (i.e., the questions that Microsoft and GitHub had been promising to answer for a year, and have now formally refused to answer) were:
- What case law, if any, did you rely on in Microsoft & GitHub’s public claim, stated by GitHub’s (then) CEO, that: “(1) training ML systems on public data is fair use, (2) the output belongs to the operator, just like with a compiler”? In the interest of transparency and respect to the FOSS community, please also provide the community with your full legal analysis on why you believe that these statements are true. SFC thinks it can now take Microsoft and GitHub’s refusal to answer as an answer of its own: the companies obviously stand by their former CEO’s statement (the only one they’ve made on the subject), and simply refuse to justify their unsupported legal theory to the community with actual legal analysis.
- If it is, as you claim, permissible to train the model (and allow users to generate code based on that model) on any code whatsoever and not be bound by any licensing terms, why did you choose to only train Copilot’s model on FOSS? For example, why are your Microsoft Windows and Office codebases not in your training set? Microsoft and GitHub’s refusal to answer hints at the real answer to this question, too: while GitHub gladly exploits FOSS inappropriately, they value their own “intellectual property” much more highly than FOSS, and are content to ignore and erode the rights of FOSS users but not their own.
- Can you provide a list of licenses, including names of copyright holders and/or names of Git repositories, that were in the training set used for Copilot? If not, why are you withholding this information from the community? One can only wildly speculate as to why they refuse to answer this question. However, good scientific practice would mean that they could answer it in any event. (Good scientists take careful notes about the exact inputs to their experiments.) Since GitHub refuses to answer, SFC’s best guess is that GitHub doesn’t have the ability to carefully reproduce its resulting model, so it doesn’t actually know whose copyrights it infringed, or when and how.
As a result of GitHub’s bad actions, today the Software Freedom Conservancy calls on all FOSS developers to leave GitHub. They acknowledge that answering that call requires sacrifice and great inconvenience, and will take much time to accomplish. Yet refusing GitHub’s services is the primary power developers have to send a strong message to GitHub and Microsoft about their bad behaviour. GitHub’s business model has always been “proprietary vendor lock-in”. That’s the very behaviour FOSS was founded to curtail, and it’s why quitting proprietary software you depend on in favour of a FOSS solution is often difficult. But remember, GitHub needs FOSS projects to use their proprietary infrastructure more than we need their proprietary infrastructure to build with. Alternatives exist, albeit with less familiar interfaces and on less popular websites, but we can also help improve those alternatives. And if you join them, you will not be alone. Software Freedom Conservancy has launched a website, GiveUpGitHub, where they’ll provide tips, ideas, methods, tools and support to those who wish to leave GitHub with them. Watch that site and their blog throughout 2022 (and beyond!) for more on this subject.
Most importantly, they are committed to offering alternatives to projects that don’t yet have another place to go. They will be announcing more hosting instance options, and a guide for replacing GitHub services, in the coming weeks. If you’re ready to take on the challenge now and give up GitHub today, they note that Codeberg, which is based on Gitea, implements many (although not all) of GitHub’s features. Because that coverage is incomplete, they’re also going to work on even more solutions, continue to vet other FOSS options, and publish and/or curate guides on, for example, how to deploy a self-hosted instance of the GitLab Community Edition.
Meanwhile, their committee continues to study carefully the general question of AI-assisted software development tools. One recent preliminary finding was that AI-assisted software development tools can be constructed in a way that respects FOSS licenses by default. They will continue to support the committee as it explores that idea further, and they are actively monitoring this novel area of research. While Microsoft’s GitHub was the first mover in this area, early reports suggest that, by way of comparison, Amazon’s new CodeWhisperer system (also launched last week) seeks to provide proper attribution and licensing information for code suggestions.
This goes back to long-standing problems with GitHub, and the central reason why we must together give up on GitHub. They’ve seen that with Copilot, with GitHub’s core hosting service, and in nearly every area of endeavour, GitHub’s behaviour is substantially worse than that of its peers. They don’t believe Amazon, Atlassian, GitLab, or any other for-profit hoster is a perfect actor either, but GitHub fares much worse in any such comparison. GitHub also has a record of ignoring, dismissing and/or belittling community complaints on so many issues that SFC must urge all FOSS developers to leave GitHub as soon as they can. Please join them in their efforts to return to a world where FOSS is developed using FOSS.