AI bot blitzkrieg
Publishers report AI companies are scraping their sites and consuming massive bandwidth but sending little traffic back.
Websites are buckling under an onslaught of AI crawler bots from OpenAI and other companies, seeing massive traffic surges but little engagement from human readers, Press Gazette reports. This bot blitzkrieg is costing money, consuming bandwidth, and frustrating publishers and users alike.
Trusted Reviews crashed multiple times on August 16 when AI bots scraped the site 1.6 million times in a single day. The traffic surge forced the technology review site offline, highlighting a growing crisis facing publishers across the industry.
Chris Dicker, chief executive of Candr Media Group, said the scraping resulted in just 603 actual users visiting Trusted Reviews from generative AI platforms. That’s a conversion rate of 0.037 percent, “dramatically lower than you would expect from traditional search,” he said in a LinkedIn post.
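The arithmetic behind that figure can be checked in a few lines. This is only a sketch using the two numbers quoted above (1.6 million scrapes in a day and 603 referred users); the variable names are illustrative, and it assumes the rate is simply referred users divided by scrapes.

```python
# Figures as reported for Trusted Reviews (from the article).
scrapes_in_one_day = 1_600_000
referred_users = 603

# Share of scrapes that led to a human visit, expressed as a percentage.
rate_percent = referred_users / scrapes_in_one_day * 100

# How many scrapes the site absorbed for each referred visitor.
scrapes_per_visitor = scrapes_in_one_day / referred_users

print(f"Referral rate: {rate_percent:.4f}%")
print(f"Scrapes per referred visitor: {scrapes_per_visitor:.0f}")
```

Under that assumption the rate works out to just under 0.038 percent, in line with the roughly 0.037 percent Dicker cited, or more than 2,600 scrapes for every visitor sent back.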
The Copilot uses generative AI to help Media Copilot editors create drafts of news stories, allowing us to cover more stories at the intersection of AI and media. Media Copilot editors carefully edit every Copilot story for accuracy, and often add relevant details, context, and links.
The data reveals a stark imbalance between what AI companies take and what they give back. Stuart Forrest, global SEO director at Bauer Media, told Press Gazette that LLM bots now account for roughly 20 percent of total scraping across publisher sites, while ChatGPT refers only 0.2 percent of all web traffic.
Many AI companies are ignoring robots.txt files that signal websites don’t want to be crawled. Dicker said OpenAI scraped Trusted Reviews 12.2 million times over three months despite being blocked, while Meta ignored the block 2.8 million times and Amazon 2.4 million times.
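For readers unfamiliar with how robots.txt works, the sketch below shows how a well-behaved crawler would interpret a block, using Python's standard urllib.robotparser module. The robots.txt contents, the example.com URL, and the "SomeBrowser" agent name are hypothetical and are not taken from Trusted Reviews; GPTBot is the user-agent token OpenAI publishes for its crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block OpenAI's crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder domain used only for illustration.
gptbot_allowed = is_allowed("GPTBot", "https://example.com/reviews/")
other_allowed = is_allowed("SomeBrowser", "https://example.com/reviews/")

print("GPTBot allowed:", gptbot_allowed)
print("Other agent allowed:", other_allowed)
```

The file is only a request. Nothing in it enforces the rule, so a crawler that never performs this check, or disregards the answer, can keep fetching pages, which is the behavior Dicker describes.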
The financial impact extends beyond hosting costs. Smaller publishers report being forced into expensive hosting upgrades, with some facing thousands of pounds in additional annual fees. Site crashes also mean lost advertising revenue and potential brand damage.
“Some of our members have come to me and said, ‘Chris, we’ve got an issue where we are getting scraped so much that our hosting providers are saying we now need to move up a package and that is costing thousands of pounds a year,’” Dicker said.
The engagement quality is worse, too. Visitors from generative AI platforms spend 58 percent less time on sites and view 10 percent fewer pages than average users.
Publishers are exploring solutions like Tollbit, which allows sites to set fees for AI crawlers. Cloudflare, meanwhile, announced in July that it would let websites block bots by default. However, the onus remains on publishers to identify and block new bots as they emerge.
This article was AI-assisted. For more on what this means, see the disclosure above.