Turning bot-blocking into business-building, with DataDome’s Aurélie Guerrieri

Publishers don’t need bigger walls—they need dials. Here’s how to see, price, and shape AI bots and agent activity instead of getting steamrolled by it.

If the last two years were about discovering that AI agents are vacuuming up the web, the next two will be about deciding what to do about it. Do you block, meter, license—or build your own agent and make the bots pay?

On this episode of The Media Copilot, host Pete Pachal welcomes Aurélie Guerrieri, Chief Marketing and Alliances Officer at DataDome, a leader in building bot defenses. Together, they dive into the new reality of AI-driven traffic: from LLM crawlers and real-time “prompt-time fetching” to the rising tide of agentic activity that acts on users’ behalf.

Instead of framing the debate as simply good bots versus bad bots, the conversation explores a more practical lens: identity versus intent, and how publishers can reclaim control, revenue, and visibility in an internet increasingly shaped by AI distribution.

Why this matters now

  • Scale & speed broke the old defenses. Content Delivery Networks (CDNs, servers that cache and deliver website content from locations closer to users) and Web Application Firewalls (WAFs, security systems that filter and monitor HTTP traffic between users and web applications) still matter, but they adapt in minutes. Attackers now act in seconds and from distributed IPs that look like everyday users.

  • AI changed the mix of traffic. DataDome sees enormous growth in prompt-time fetching: LLMs hitting your most valuable pages (latest articles, pricing, paywalled previews), outpacing traditional crawling by as much as 20 to 1 in some cases.

  • The business model is shifting. “Open web” ≠ “open season.” Publishers need to decide who gets access, for what, and at what price, and they need tooling that can enforce those choices in real time.

“AI is part of the problem—and part of the solution. We use AI to fight AI.” – Aurélie Guerrieri




The new stack: from identity to intent to control

The next phase of AI governance on the open web won’t be won by bigger walls; it’ll be won by better instruments. Verification frameworks are starting to appear, offering cryptographic IDs for bots and agents to help answer the most basic question: who is knocking? This kind of identification is important, but it is only the first step.

“Identity is a great first check,” says DataDome’s Aurélie Guerrieri. “Intent is what you need to look at next. And your response should be nuanced, not all-or-nothing.”
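Even before the newer cryptographic frameworks, there has been a well-documented way to answer “who is knocking”: the two-way DNS check that major search engines publish for verifying their crawlers. Here is a minimal sketch in Python; the hostname suffixes are Google’s published Googlebot domains, but the function names are illustrative, not any vendor’s API.

```python
import socket

def hostname_is_allowed(hostname: str, suffixes: tuple) -> bool:
    """Pure check: does a reverse-DNS hostname fall under a trusted domain?"""
    return hostname.endswith(tuple(suffixes))

def verify_crawler_ip(ip: str, suffixes=(".googlebot.com", ".google.com")) -> bool:
    """Two-way DNS verification: reverse-resolve the IP to a hostname,
    confirm the hostname is under a trusted domain, then forward-resolve
    that hostname and confirm it maps back to the same IP."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)            # reverse DNS
    except OSError:
        return False
    if not hostname_is_allowed(hostname, suffixes):
        return False
    try:
        _, _, addresses = socket.gethostbyname_ex(hostname)  # forward DNS
    except OSError:
        return False
    return ip in addresses
```

A check like this confirms only who the bot is, not what it is doing, which is exactly the gap Guerrieri points to.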

The more meaningful work begins with intent. Publishers must understand not only who or what is hitting their sites but why. Traditional indexing and crawling are usually beneficial, providing discovery and visibility that publishers want. But prompt-time fetching, when large language models reach into live pages at the exact moment a user asks a question, is far more disruptive.


These requests often target a site’s most valuable assets: fresh reporting, pricing pages, or previews designed for paid subscribers. Beyond that lies the emerging reality of agentic activity, in which automated systems log in, create trial accounts, post comments, or even transact on behalf of users. These actions can be useful when authenticated and well behaved, but they are equally capable of abuse when spoofed or hijacked.

That is why control can no longer be reduced to a simple decision between blocking or allowing traffic. Publishers need nuanced options. Some automated behaviors might be welcomed, others throttled, and still others challenged, priced, or redirected to paid endpoints. In Guerrieri’s words, “Open web is not open season on content. Publishers get to decide who has access, for what purposes, and at what cost.”

What DataDome is seeing in the wild

Static defenses are increasingly being outpaced. Traditional tools such as content delivery networks and web application firewalls adapt in minutes, while attackers now strike in seconds. DataDome reports AI-driven assaults that are larger, faster, and more distributed, often routed through residential IP addresses that appear legitimate. DDoS campaigns increasingly “hit in the two-minute window” before a site’s primary protections fully adapt.

At the same time, the demand from large language models is shifting to high-value endpoints. New articles, pricing tables, and gated content are being hit in real time, creating both performance issues and business concerns. To keep pace, DataDome relies on hundreds of foundational AI models and tens of thousands of custom models trained for different industries. This infrastructure allows it to score intent at speed and respond within fractions of a second. Guerrieri says that the payoff is measurable: on DDoS attacks alone, the company often blocks about 20% more than a typical CDN’s first layer.

Beyond “block it all”: how leaders are adapting

Inside media companies, executives often describe a feeling of blindness. With so much traffic detouring through AI surfaces, many estimate they have lost 30 to 50 percent of visibility into what users are actually doing. The first step, Guerrieri argues, is to “restore that visibility” with a clear accounting of all automated traffic and its purpose. Only then can product, revenue, editorial, and security teams make decisions together.

Once visibility is restored, the challenge becomes mapping policies to intent. Crawling for discovery may be allowed with conditions. Prompt-time fetching might be throttled, redirected, or pushed toward APIs that can be priced. Agentic activity should be closely monitored, authenticated, and checked for behaviors such as trial abuse or comment spam.
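That tiered approach can be made concrete as a small dispatch table. The sketch below is hypothetical, not DataDome’s implementation: the intent categories come from the conversation, while the response labels and function names are illustrative.

```python
from enum import Enum, auto

class Intent(Enum):
    SEARCH_CRAWL = auto()   # traditional indexing for discovery
    PROMPT_FETCH = auto()   # LLM pulling a live page at prompt time
    AGENT_ACTION = auto()   # agent logging in or transacting for a user
    UNKNOWN = auto()

def policy_for(intent: Intent, authenticated: bool = False) -> str:
    """Map a classified intent to a nuanced response rather than
    an all-or-nothing block: allow, price, challenge, or block."""
    if intent is Intent.SEARCH_CRAWL:
        return "allow"                    # discovery traffic publishers want
    if intent is Intent.PROMPT_FETCH:
        return "redirect-to-metered-api"  # price it instead of serving it free
    if intent is Intent.AGENT_ACTION:
        # useful when authenticated and well behaved; challenge otherwise
        return "allow" if authenticated else "challenge"
    return "block"
```

The point is not the specific labels but the shape: one classification step, then a policy layer that product, revenue, and security teams can tune together.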

Pricing is the next frontier. Some publishers are experimenting with traditional licensing and enforcement, while others are joining new marketplaces that offer pay-per-crawl or pay-per-fetch models such as TollBit. Others are opening agent-ready endpoints in the form of metered APIs or customized responses that link back to their sites. The boldest are exploring ways to build their own agents, transforming authority into direct transactions. Guerrieri points to the example of a recipe site that could use an agent to scan a user’s fridge, suggest meals, assemble a shopping cart, and generate revenue from affiliate or delivery partnerships.

“The crucial factor,” she says, “is to avoid lock-in.” DataDome is working with multiple monetization partners, including TollBit and Skyfire, so that publishers can test business models without being forced to commit to a single platform. That flexibility, combined with better visibility, allows organizations to experiment and iterate quickly.

Cloudflare, Perplexity, and a market getting educated

The debate over AI bots entered the mainstream this summer when Cloudflare announced that it would begin blocking them by default. The move was a signal that first-layer defenses must evolve. But it was the public back-and-forth with Perplexity—accused of ignoring robots.txt and scraping content—that crystallized the real issues. The dispute was not about whether the scraping happened, but whether it should be considered legitimate “user activity by proxy.”

The exchange highlighted two critical lessons. First, accuracy matters. Misidentification damages both traffic and trust. Second, control must remain in the hands of publishers, who are best positioned to decide who gets access and under what conditions. Guerrieri is clear on this point: identity is the first check, but intent is what ultimately matters, and publishers need tools that allow for nuanced decisions rather than blanket bans.

If Guerrieri ran a newsroom

Asked what she would do if she were running a media company today, Guerrieri does not hesitate. Her advice is simple: measure, control, and experiment. Know exactly what automated traffic is hitting your site and why. Set rules that protect value without shutting the door on opportunity. Route the best demand toward paid or owned channels. Then move fast—build your own agent, test new revenue paths, and keep iterating.

Her philosophy is clear: the future of publishing won’t be about keeping bots out, but about shaping how they come in—and making them pay for the privilege.


This episode of The Media Copilot was hosted by Pete Pachal, with executive producer Michele Musso and video/audio editing by the Musso Media team. Produced by Musso Media. © 2025 Musso Media. All rights reserved.

Music: Favorite by Alexander Nakarada, licensed under Creative Commons by Attribution 4.0 License

© AnyWho Media 2025
