Judge dismisses majority of GitHub Copilot copyright claims
A judge dismissed most copyright claims against GitHub, Microsoft, and OpenAI regarding GitHub Copilot, leaving two claims active, rejecting the alleged DMCA violations, and barring the refiling of the dismissed claims.
A judge has dismissed most of the copyright claims in a lawsuit against GitHub, Microsoft, and OpenAI regarding the AI-powered coding assistant GitHub Copilot. The lawsuit, initiated by developers in 2022, originally included 22 claims alleging copyright violations. Judge Jon Tigar's recently unsealed ruling leaves only two claims active: one concerning an open-source license violation and another related to breach of contract. The court dismissed the primary allegation that GitHub Copilot violated the Digital Millennium Copyright Act (DMCA) by suggesting code without proper attribution. The judge found the developers' arguments unconvincing, stating that the code in question was not sufficiently similar to the original works, and noted that GitHub Copilot rarely reproduces memorized code. Consequently, the judge dismissed the DMCA claim with prejudice, preventing the developers from refiling it. Requests for punitive damages and monetary relief were also dismissed. Despite this ruling, the legal battle continues: the remaining claims are likely to proceed through litigation, highlighting the ongoing legal complexities surrounding AI coding assistants and their training on existing codebases.
- Most copyright claims against GitHub Copilot have been dismissed.
- Only two claims remain: one for open-source license violation and one for breach of contract.
- The judge found the arguments regarding DMCA violations unconvincing.
- The ruling prevents the developers from refiling the dismissed claims.
- The case underscores the legal challenges faced by AI-powered coding tools.
Related
Coders' Copilot code-copying copyright claims crumble against GitHub, Microsoft
A judge dismissed a DMCA claim against GitHub, Microsoft, and OpenAI over Copilot. Remaining are claims of license violation and breach of contract. Dispute ongoing regarding discovery process. Defendants defend Copilot's compliance with laws.
Judge dismisses DMCA copyright claim in GitHub Copilot suit
A judge dismissed a DMCA claim against GitHub, Microsoft, and OpenAI over Copilot. The lawsuit alleged code suggestions lacked proper credit. Remaining claims involve license violation and breach of contract. Both sides dispute document production.
The developers suing over GitHub Copilot got dealt a major blow in court
A California judge dismissed most claims in a lawsuit against GitHub, Microsoft, and OpenAI over code copying by GitHub Copilot. Only two claims remain: open-source license violation and breach of contract. The court ruled Copilot didn't violate copyright law.
Judge dismisses lawsuit over GitHub Copilot coding assistant
A US judge dismissed a lawsuit against GitHub over AI training with public code. Plaintiffs failed to prove damages for breach of contract. GitHub Copilot faces scrutiny for using open-source code.
GitHub Copilot is not infringing your copyright
GitHub Copilot, an AI tool, faces controversy for using copyleft-licensed code for training. Debate surrounds copyright infringement, AI-generated works, and implications for tech industry and open-source principles.
Everything was going great and I had a working example, so I decided to look online for some example code to verify I was doing things correctly and not making any glaring mistakes. It was then that I found an exact line-by-line copy of what ChatGPT had given me. This was before it had the ability to google things, and the code predated OpenAI. It had even carried over spelling errors in the variables; the only thing it changed was translating the comments from Spanish to English.
I had always been under the impression that ChatGPT just learned from sources and then gave you a new result based roughly on those sources. I think some of the confounding variables here were: 1. this was a very specific use case and not many examples existed, and 2. all OpenGL code looks similar, to a point.
The worst part was that there was no license provided for the code or the repo, so it was not legal for me to take the code wholesale like that. I am now much more cautious about asking ChatGPT for code; I only have it give me direction now, and no longer use 'sample code' that it produces.
In particular this:
An amended version of the complaint had taken issue with GitHub’s duplication detection filter, which allows users to “detect and suppress” Copilot suggestions matching public code on GitHub.
The developers argued that users who turned off this filter would “receive identical code” and cited a study showing how AI models can “memorise” and reproduce parts of their training data, potentially including copyrighted code.
However, Judge Tigar found these arguments unconvincing. He determined that the code allegedly copied by GitHub was not sufficiently similar to the developers’ original work. The judge also noted that the cited study itself mentions that GitHub Copilot “rarely emits memorised code in benign situations.”
I think this is the key point: reproduction is the issue, not training. And as noted in the study[1], reproduction doesn't usually happen unless you go to extra lengths to make it happen.
[1] Not sure but maybe https://dl.acm.org/doi/abs/10.1145/3597503.3639133? Can anyone find the filing?
Copyright does not restrict reading a book or watching a movie. Copyright also does not restrict access to a work. It only restricts duplication without express authorization. For computer data, the restricted duplication typically refers to dedicated storage, such as storage on disk as opposed to storage in CPU cache.
When Viacom sued YouTube for $1.6 billion they were trying to halt the public from accessing their content on YouTube. They only sued YouTube, not YouTube users, and only because YouTube stored Viacom IP without permission.
Discussion from July:
Judge dismisses DMCA copyright claim in GitHub Copilot suit
Contract is understandable - it supersedes almost everything else. If the law says I can do X but the contract says I can't, then I almost certainly can't.
It's nice to see open-source licenses being treated as having somewhat the same solidity as a contract.
these seem to be major claims?
I was lucky to learn early on that publishing important things to the web meant relinquishing control of not just the IP, but my own agency and fate. The cost far exceeded the benefits of generosity, be it contributions to FOSS, public blogging or documentation, or even just writing.
Time is the only fixed resource, and mine is proprietary, exclusive, and for sale to the highest bidder.
"Sciences" refers not only to fields of modern scientific inquiry but rather to all knowledge
The hacker ethic is a philosophy and set of moral values within hacker culture. Practitioners believe that sharing information and data with others is an ethical imperative
hrmmm...
We need new laws, especially regarding deepfakes; it's shocking how many people think revenge-porn laws and the like are going to be enough here. Rather than focusing only on data usage, we need more fundamental laws and rights, like the right to control representations of ourselves, as Japan has, where producing images, voice, or video in someone's likeness is prosecutable outright. Likewise, we need laws that explicitly target data use for training, separate from copyright.
The way LLMs are trained is obviously too similar to how humans learn, and the transformation and then output produce works that are novel based on that "learning", just like humans do. This is so fundamentally different to what copyright laws were made to cover, I find it infuriating how many people handwave these arguments away. Only in perfect 1-to-1 regurgitation does it even feel close to something copyright would be able to cover.
I guess Microsoft has gotten what it wanted and has finally reached the "extinguish" stage of its plan for open source, and all it needed was a chatbot.
I don't know of anything an LLM (or "AI") can do that a human couldn't, with enough time. If it can get a human in trouble, it should get the operators of the AI in trouble too. Likewise, if a human can do it, I don't see why an AI is any different.