July 16th, 2024

Apple trained AI models on YouTube content without consent

Tech giants, including Apple, reportedly used YouTube video subtitles to train AI models without creators' consent. Their reliance on third-party datasets raises legal and ethical concerns and affects the creators whose work was used.

Tech giants, including Apple, have reportedly trained AI models using YouTube videos without creators' consent. Subtitle files from over 170,000 videos were used, affecting creators such as MKBHD, MrBeast, and others. The files, which act as video transcripts, were obtained by a third party and used by companies like Apple, Nvidia, and Salesforce. The downloads were facilitated by EleutherAI, a non-profit that assists AI model training. Although the dataset was meant to help small developers and academics, major tech firms used it as well. Apple used the dataset to train OpenELM, a model enhancing AI capabilities on iPhones and MacBooks. Although Apple did not download the data itself, the case raises concerns about the legal implications and ethical use of web-scraped datasets. It also highlights the challenges of AI training ethics and the risks of relying on datasets compiled by third parties. Apple had not responded to requests for comment at the time of reporting.

Related

Microsoft says that it's okay to steal web content because it's 'freeware.'

Microsoft's CEO of AI, Mustafa Suleyman, believes web content is "freeware" for AI training unless specified otherwise. This stance has sparked legal disputes and debates over copyright infringement and fair use in AI content creation.

Microsoft CEO of AI: Your online content is 'freeware' fodder for training models

Mustafa Suleyman, CEO of Microsoft AI, faced legal action for using online content as "freeware" to train neural networks. The debate raises concerns about copyright, AI training, and intellectual property rights.

Microsoft AI CEO: Web content is 'freeware'

Microsoft's AI CEO discusses AI training on web content, emphasizing fair use unless restricted. Legal challenges arise over scraping restrictions, highlighting the balance between fair use and copyright concerns for AI development.

AI Companies Need to Be Regulated: Open Letter

AI companies face calls for regulation due to concerns over unethical practices highlighted in an open letter by MacStories to the U.S. Congress and European Parliament. The letter stresses the need for transparency and protection of content creators.

OpenAI pleads it can't make money without using copyrighted material for free

OpenAI has asked the British Parliament to permit the use of copyrighted material for AI training. The company faces legal challenges from the NYT and the Authors Guild over alleged copyright infringement. The debate affects both AI development and copyright protection, raising concerns for content creators.

6 comments
By @jmward01 - 4 months
I hadn't realized that it was well settled that scraping from youtube isn't legitimate. How ironic that would be. Does that mean that google search results aren't legal? You get full images back from that and google makes money on those results. They have trained search algorithms on those results too. Does that mean every site that they broke the TOS of when they scraped it and trained on it can sue them? Is there a real legal precedence that has set a clear line here?
By @realengineer123 - 4 months
everyone who whined and complained about openai and msft doing similar things are going to freak out on apple too right?....right?