Is It Time To Give Up GitHub?

Those who forget history often unintentionally repeat it. Some of us in the developer world may recall that twenty-one years ago, the most popular code hosting site, a fully Free and Open Source (FOSS) site called SourceForge, proprietarized all of its code, never to make it FOSS again. All major FOSS projects slowly left SourceForge, since it had itself become a proprietary system, contrary to the philosophy of FOSS. FOSS communities learned that it was a mistake to allow a for-profit, proprietary software company to become their dominant collaborative development site. SourceForge slowly collapsed after the DotCom crash back in 2000, and to this day, SourceForge refuses to solve these problems. We learned a valuable lesson that was a bit too easy to forget, especially when corporate involvement manipulates FOSS communities for its own purposes. We must now learn the SourceForge lesson again, with Microsoft’s GitHub doing the same thing.

[Image: A parody of the GitHub logo, walling off user rights and demanding payment. Image from the article posted on the Software Freedom Conservancy website.]

The Software Freedom Conservancy’s three primary questions for Microsoft/GitHub regarding Copilot (i.e., the questions that Microsoft and GitHub had been promising for a year to answer, and have now formally refused to answer) were:

  1. What case law, if any, did you rely on in Microsoft & GitHub’s public claim, stated by GitHub’s (then) CEO, that: “(1) training ML systems on public data is fair use, (2) the output belongs to the operator, just like with a compiler”? In the interest of transparency and respect to the FOSS community, please also provide the community with your full legal analysis on why you believe that these statements are true. The Conservancy now takes Microsoft and GitHub’s refusal to answer as an answer of its own: the companies evidently stand by their former CEO’s statement (the only one they’ve made on the subject), and simply refuse to justify their unsupported legal theory to the community with actual legal analysis.
  2. If it is, as you claim, permissible to train the model (and allow users to generate code based on that model) on any code whatsoever and not be bound by any licensing terms, why did you choose to only train Copilot’s model on FOSS? For example, why are your Microsoft Windows and Office codebases not in your training set? Microsoft and GitHub’s refusal to answer hints at the real answer to this question, too: while GitHub gladly exploits FOSS inappropriately, the companies value their own “intellectual property” much more highly than FOSS, and are content to ignore and erode the rights of FOSS users but not their own.
  3. Can you provide a list of licenses, including names of copyright holders and/or names of Git repositories, that were in the training set used for Copilot? If not, why are you withholding this information from the community? One can only wildly speculate as to why Microsoft and GitHub refuse to answer this question. However, good scientific practice would mean that they could answer it in any event. (Good scientists take careful notes about the exact inputs to their experiments.) Since GitHub refuses to answer, the Conservancy’s best guess is that GitHub lacks the ability to carefully reproduce its resulting model, and so doesn’t actually know whose copyrights it infringed, or when and how.

Most importantly, the Conservancy is committed to offering alternatives to projects that don’t yet have another place to go. It will be announcing more hosting instance options, and a guide for replacing GitHub services, in the coming weeks. If you’re ready to take on the challenge now and give up GitHub today, the Conservancy notes that CodeBerg, which is based on Gitea, implements many (although not all) of GitHub’s features. The Conservancy is therefore also going to work on even more solutions, continue to vet other FOSS options, and publish and/or curate guides on, for example, how to deploy a self-hosted instance of the GitLab Community Edition.

This goes back to long-standing problems with GitHub, and the central reason why we must together give up on GitHub. The Conservancy has seen that with Copilot, with GitHub’s core hosting service, and in nearly every area of endeavor, GitHub’s behavior is substantially worse than that of its peers. The Conservancy doesn’t believe that Amazon, Atlassian, GitLab, or any other for-profit hoster is a perfect actor either. However, a relative comparison shows that GitHub’s behavior is much worse than that of the others out there. GitHub also has a record of ignoring, dismissing and/or belittling community complaints on so many issues that the Conservancy must urge all FOSS developers to leave GitHub as soon as they can. Please join the Conservancy in its efforts to return to a world where FOSS is developed using FOSS.






