YouTube creators surprised to find Apple and others trained AI on their videos
YouTube creators express surprise as tech giants Apple, Salesforce, and Anthropic train AI models on YouTube videos without consent. Dataset "the Pile" by EleutherAI includes content from popular creators and media brands. Ethical concerns arise.
YouTube creators were surprised to discover that major tech companies like Apple, Salesforce, and Anthropic had trained their AI models on tens of thousands of YouTube videos without the creators' consent. The companies used a dataset called "the Pile," created by EleutherAI, which includes YouTube captions scraped from over 48,000 channels, including videos from popular YouTubers like MrBeast and PewDiePie. The dataset also contained content from mainstream media brands like Ars Technica. While some creators expressed frustration at the unauthorized use of their content, companies like Anthropic defended their actions, stating that the dataset was a small subset of YouTube subtitles and that its use did not directly violate YouTube's terms of service. The incident highlights how little control creators have over the use of their content online and raises questions about the ethics of training AI models on publicly available data.
Related
OpenAI and Anthropic are ignoring robots.txt
Two AI startups, OpenAI and Anthropic, are reported to be disregarding robots.txt rules, allowing them to scrape web content despite claiming to respect such regulations. TollBit analytics revealed this behavior, raising concerns about data misuse.
YouTube in talks with record labels over AI music deal
YouTube is in talks with major record labels to license AI tools replicating artists' music. Some artists are wary of devaluation concerns. Negotiations aim to involve select artists for AI music generation.
Microsoft CEO of AI: Your online content is 'freeware' fodder for training models
Mustafa Suleyman, CEO of Microsoft AI, faced legal action for using online content as "freeware" to train neural networks. The debate raises concerns about copyright, AI training, and intellectual property rights.
YouTube lets you request removal of AI content that simulates your face or voice
YouTube's new policy allows users to request removal of AI-generated content mimicking their face or voice to address privacy concerns. Requests are assessed based on disclosure, identification, public interest, and sensitive behaviors. Content uploaders have 48 hours to respond to complaints.
Apple trained AI models on YouTube content without consent
Tech giants, like Apple, used YouTube video subtitles without creators' consent for AI training. Concerns over legality and ethics arise as companies leverage third-party datasets, impacting creators and raising AI training ethics issues.
Source: https://www.wired.com/story/youtube-training-data-apple-nvid...
Some more discussion: https://news.ycombinator.com/item?id=40977465