June 28th, 2024

Microsoft's AI boss Suleyman has a curious understanding of web copyright law

Microsoft's AI boss, Mustafa Suleyman, suggests open web content is free to copy, sparking copyright controversy. AI firms debate fair use of copyrighted material for training, highlighting legal complexities and intellectual property concerns.

Microsoft's AI boss, Mustafa Suleyman, has sparked controversy by suggesting that content on the open web is fair game for anyone to copy and use freely, dubbing it "freeware." That view is at odds with copyright law: in the US, a work is protected by copyright automatically from the moment it is created. Suleyman's stance came to light during a discussion about AI companies allegedly using copyrighted online material to train AI models. Although such content is legally protected, some AI firms argue that training on copyrighted material falls under fair use, a defense that is typically decided by a court. Suleyman's comments have drawn criticism for disregarding copyright law and for what they imply about intellectual property rights. He also touched on robots.txt files as a way for sites to tell crawlers not to scrape their content, a point that highlights the unsettled legal questions around web content usage. The debate underscores the challenges posed by AI advancements and the need for a nuanced understanding of intellectual property rights in the digital age.
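To make the robots.txt point concrete, here is a minimal sketch of how a well-behaved crawler might check a site's robots.txt before fetching a page, using Python's standard `urllib.robotparser` module. The file contents, the crawler name `ExampleAIBot`, and the `is_allowed` helper are all hypothetical and chosen only for illustration; they do not describe any real company's crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one (made-up) AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/some-post"
    print(is_allowed(ROBOTS_TXT, "ExampleAIBot", page))  # blocked crawler
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", page))  # falls under "*"
```

Note that this check only reports what the file asks for. Honoring it is voluntary on the crawler's side, and robots.txt is not an access-control mechanism.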

Related

OpenAI and Anthropic are ignoring robots.txt

Two AI startups, OpenAI and Anthropic, are reported to be disregarding robots.txt rules and scraping web content despite saying they respect such rules. Analytics from TollBit revealed this behavior, raising concerns about data misuse.

We need an evolved robots.txt and regulations to enforce it

In the era of AI, the robots.txt file faces limitations in guiding web crawlers. Proposals advocate for enhanced standards to regulate content indexing, caching, and language model training. Stricter enforcement, including penalties for violators like Perplexity AI, is urged to protect content creators and uphold ethical AI practices.

RIAA of Six Years Ago Debunks RIAA of Today's AI Lawsuit Claims

The RIAA is suing AI music services Suno and Udio for alleged copyright infringement, sparking debate over fair use and implications for the AI industry and copyright law. Critics question the RIAA's motives.

Copyright Takedowns: A Cautionary Tale

The article delves into fair use complexities in copyright law, citing the Blurred Lines case. It discusses challenges with automated takedowns by systems like Content ID, emphasizing the struggle for content creators against entities like Universal Music Group. It raises concerns about filternets' impact on free expression, advocating for a balanced copyright enforcement approach.

Microsoft says that it's okay to steal web content because it's 'freeware.'

Microsoft's CEO of AI, Mustafa Suleyman, believes web content is "freeware" for AI training unless specified otherwise. This stance has sparked legal disputes and debates over copyright infringement and fair use in AI content creation.

1 comment
By @archontes - 5 months
When you write something on the internet, you automatically obtain a copyright on it.

Copyright provides the exclusive rights to reproduce, adapt, publish, perform, and display that thing.

Training an AI model isn't any of those things.

If you transmit a thing to me, and I have those bits on my computer, you don't get to determine that I can't train an AI on it, unless we signed an agreement further restricting my use prior to you transmitting it to me.

Now. My AI might produce a work that is sufficiently similar to your work that it is considered a reproduction or adaptation, but that doesn't mean that the training was an infringement.

Also, courts have repeatedly held that web scraping is entirely legal.

If you don't want folks (or their computers) learning from things you create, don't put them on the internet.

NOW for the hilarious follow-on: Copyright is not granted for the results of an automatic process. Training an AI is an automatic process, and it's plausible that attempting to claim copyright on model weights would fail if it were litigated fully. It's more likely they'd qualify for trade secret protection.