July 26th, 2024

Anthropic accused of 'egregious' data scraping

AI start-up Anthropic faces accusations of aggressive data scraping, disrupting web publishers' services. Despite claims of compliance, concerns grow over ethical practices and potential violations of terms of service.

AI start-up Anthropic is accused of aggressively scraping data from websites to train its AI models, potentially in violation of publishers' terms of service. The company, founded by former OpenAI researchers, has been described as "the most aggressive scraper" by Matt Barrie, CEO of Freelancer.com, which reported 3.5 million visits from an Anthropic-linked crawler in just four hours. Other web publishers have raised similar concerns, stating that Anthropic's bots ignored requests to stop collecting data and caused traffic spikes that disrupted their services and revenue.

Anthropic says it respects publishers' requests and aims to minimize disruption, asserting that its crawlers comply with standard protocols like 'robots.txt'. However, data scraping has intensified in recent years as the AI industry has grown, imposing new costs on website operators. For instance, Kyle Wiens, CEO of iFixit.com, reported one million hits from Anthropic bots in 24 hours, highlighting the strain on the site's resources.
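For readers unfamiliar with 'robots.txt', the following is a minimal sketch of how a compliant crawler might check a site's rules before fetching a page, using Python's standard `urllib.robotparser` module. The rules, the crawler name "ExampleBot", and the example.com URLs are all hypothetical and chosen for illustration; this is not a description of how any company's crawler actually works.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. "ExampleBot" is an invented crawler name,
# not the user agent of any real crawler mentioned in the article.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# ExampleBot is disallowed everywhere, so a compliant crawler must not fetch this.
blocked = parser.can_fetch("ExampleBot", "https://example.com/guides/1")

# Other crawlers fall under the "*" rules: public pages are allowed,
# anything under /private/ is not.
allowed = parser.can_fetch("OtherBot", "https://example.com/guides/1")
private = parser.can_fetch("OtherBot", "https://example.com/private/x")

print(blocked, allowed, private)
```

Compliance with 'robots.txt' is voluntary: the file only states the publisher's wishes, and nothing in the protocol stops a crawler from skipping this check, which is why publishers also resort to blocking offending traffic.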

While scraping publicly available data is generally legal, it raises ethical concerns and can breach terms of service. As competition among AI companies escalates, the backlash against aggressive scraping practices may grow, prompting calls for more respectful engagement with content providers. Anthropic, which aims to develop responsible AI, has not publicly announced partnerships with publishers, unlike some of its competitors.

OpenAI and Anthropic are ignoring robots.txt

Two AI startups, OpenAI and Anthropic, are reported to be disregarding robots.txt rules and scraping web content despite claiming to respect them. Analytics from TollBit revealed this behavior, raising concerns about data misuse.

Anthropic CEO on Being an Underdog, AI Safety, and Economic Inequality

Anthropic's CEO, Dario Amodei, emphasizes AI progress, safety, and economic equality. The company's advanced AI system, Claude 3.5 Sonnet, competes with OpenAI, focusing on public benefit and multiple safety measures. Amodei discusses government regulation and funding for AI development.

iFixit takes shots at Anthropic for hitting servers a million times in 24 hours

iFixit CEO Kyle Wiens criticized Anthropic for making excessive requests to their servers, violating terms of service. He expressed frustration over unauthorized scraping, highlighting concerns about AI companies accessing content without permission.

AI crawlers need to be more respectful

Read the Docs has reported increased abusive AI crawling, leading to high bandwidth costs. They are blocking offenders and urging AI companies to adopt respectful practices and improve crawler efficiency.

iFixit CEO takes shots at Anthropic for hitting servers a million times in 24h

iFixit CEO Kyle Wiens criticized Anthropic for making excessive requests to their servers, violating terms of service. This incident highlights concerns about AI companies ignoring website policies and ethical data scraping issues.
