Microsoft CEO of AI: Your online content is 'freeware' fodder for training models
Mustafa Suleyman, CEO of Microsoft AI, described online content as "freeware" that can be used to train neural networks, while Microsoft and OpenAI face legal action for using content without permission. The debate raises concerns about copyright, AI training, and intellectual property rights.
The CEO of Microsoft AI, Mustafa Suleyman, stated that machine-learning companies can use online content as "freeware" to train neural networks. This led to legal action from the Center for Investigative Reporting against OpenAI and Microsoft for using content without permission. Several lawsuits have been filed against these companies for alleged content misappropriation. Suleyman mentioned a distinction between freely available online content and copyrighted material. The legal uncertainty around using copyrighted data to train AI models has raised concerns about the future of content creation and intellectual property rights. Experts suggest that policymakers need to address these issues to balance rights and responsibilities in the AI era. The ongoing debate highlights the evolving landscape of AI model training and the potential impact on content creators and the AI industry.
Related
OpenAI and Anthropic are ignoring robots.txt
Two AI startups, OpenAI and Anthropic, are reported to be disregarding robots.txt rules, allowing them to scrape web content despite claiming to respect such regulations. TollBit analytics revealed this behavior, raising concerns about data misuse.
Record Labels Sue Two Startups for Training AI Models on Their Songs
Major record labels sue AI startups Suno AI and Uncharted Labs Inc. for using copyrighted music to train AI models. Lawsuits seek damages up to $150,000 per infringed work, reflecting music industry's protection of intellectual property.
Microsoft's AI boss Suleyman has a curious understanding of web copyright law
Microsoft's AI boss, Mustafa Suleyman, suggests open web content is free to copy, sparking copyright controversy. AI firms debate fair use of copyrighted material for training, highlighting legal complexities and intellectual property concerns.
Microsoft says that it's okay to steal web content because it's 'freeware.'
Microsoft's CEO of AI, Mustafa Suleyman, believes web content is "freeware" for AI training unless specified otherwise. This stance has sparked legal disputes and debates over copyright infringement and fair use in AI content creation.
All web "content" is freeware
Microsoft's CEO of AI discusses open web content as freeware since the 90s, raising concerns about AI-generated content quality and sustainability. Generative AI vendors defend practices amid transparency and accountability issues. Experts warn of a potential tech industry bubble.
The following rules are agreed upon by pretty much every country that has an interest in copyright: https://www.wipo.int/treaties/en/ip/berne/summary_berne.html
The cynicism is mind-blowing. Artificial Stupidities have created exactly zero knowledge so far, which is why companies like Microsoft now roll out people like Suleyman who openly admits that the DMCA is only for the rich with lawyers.
And steal and repackage "content" from altruistic creators.
This one might come back to bite Microsoft. If any case goes to the Supreme Court, I'm not sure that they'll be amused by this line of logic.
We post regularly online without any expectation of payment, but we never considered that the output could or would be used for commercial purposes. The value of our collective output is being captured by a very small elite. Not sure what we can do about this, other than support alternative ecosystems, at least to ensure that intra-corporate competition might keep prices low.
And that social contract did not include AI. You put content on the internet because you had certain expectations on how it'll be seen or used.
The EU is probably already acting on it behind closed doors. The usual vitriol against "innovation" peddled by the American way of doing business went pretty high against the AI Act (which is definitely far from perfect but a step in some direction to regulate it). In the near future I can see more rulings or even new regulation to address the absurdity of AI companies consuming all this data for their own profits with no compensation to the creators of it.
My personal opinion is that letting "innovation" be the only guide to what is "good", without any morality imbued, is stupid. A lot of us have seen the cycle by now: what was innovative before becomes entrenched, the entrenched companies become behemoths, and they obviously start abusing their position of power when consumers have very few options to not be in the system they created. The downfall of tech from what I experienced in the early 2000s to what it has come to be in the 2020s is just sad. It's the new 80s finance yuppie bullshit: instead of coke-addicted greedy as fuck bros we have nerdy-blabbing-about-changing-the-world greedy fucks reaping the profits.
This will get ugly, and companies doing it will deserve the retribution if they get fucked.
I am ambivalent about this overall, but a few things are clear. Someone getting sued does not automatically mean they are wrong and whoever is suing is right. We don't know the rules, hence the lawsuit. I see the lawsuit being used as evidence of wrongdoing, and that seems plainly wrong. Everything that becomes big ends up being sued (including artists, with allegations that they copied someone's work). It tells us nothing.
(I think this part is clear) Reproducing content verbatim without permission, and for profit, is plain old plagiarism, whether it's done via AI or by a human. In some cases, with proper citations, it is allowed, but otherwise it's a no. Summarizing content, with or without credit and citations, was always allowed, but it was never done at this scale, so this "social contract" might need to change.
https://www.microsoft.com/en-us/software-download/windows11/
The copyright issue only comes up if you publish the output of your model. But if the AI is (somehow) clever enough to never reproduce the source material in any way that counts as copying for the purposes of copyright, then there's no copyright problem making it available to the public.
Some artists assume they have more rights than they really do and that other people aren't even allowed to mimic their style.
Some more discussion: https://news.ycombinator.com/item?id=40826588
This argument cannot possibly hold in any court. This has not been the 'contract'. I cannot reproduce the content of a newspaper's online outlet, I cannot reproduce the art of another artist on Instagram, I cannot reproduce someone's YouTube video without permission. This same thing sparked the whole fair-use debate some years ago.
The exceptions to these rules have always existed in regulatory grey areas and have been discussed for decades now.
This guy is apparently still living in the Napster era, and the amount of gaslighting Microsoft, OpenAI, Google etc. perform right now to freeload on data is presumptuous.
What a joke of a person. I hope a court will roll over them harshly and explain that the world doesn't work like that, and that just because you can do something doesn't make it right.
Read a history book.