The BBC just made AI’s accuracy problem impossible to ignore
By testing AI chatbots against its own reporting, the BBC showed how newsrooms can take the lead in evaluating AI’s role in journalism.
Are hallucinations still a problem? According to a recent BBC study, the answer is yes, and probably more than you think. The British broadcaster researched how accurately AI chatbots handle news content, and the findings are notable — not just for shining a brighter light on the hallucination problem, but for pointing the way for journalists to take part in the conversation that's reshaping their industry.
More on that in a minute, but first I'd like to express my gratitude to PRWeek for naming me as one of the Class of 2025 — the publication's annual list of the 25 most influential people in communications technology.
When I founded The Media Copilot, I thought I'd spend most of my time educating newsrooms about how to leverage AI in their work. As it turned out, PR agencies quickly became the majority of my clients. I've since built AI training courses entirely focused on using AI in public relations, including the course I'm now in the midst of teaching, AI for PR & Media Professionals.
That course is so ambitious that I needed to partner with two other AI educators — Peter Bittner of The Upgrade and Kris Krüg — to design six weeks of hands-on workshopping around PR use cases like personalized pitching, influencer identification, and crisis management. Response to the class has been so great that we've launched a second cohort that begins March 18. Here's the full scoop:
AI for PR & Media Professionals — Cohort 2 Starts March 18!
AI is transforming PR—are you keeping up? If you're still using AI just to draft press releases but not for crisis detection, sentiment analysis, or media monitoring, it's time to level up.
The second cohort of our game-changing AI for PR & Media Professionals course kicks off March 18. Over six weeks, you'll move beyond basic prompts and develop real AI expertise to automate workflows, enhance storytelling, and drive PR success.
What’s in it for you?
✅ Live instruction from Peter Bittner, Kris Krüg, and Pete Pachal – top experts in AI, media, and PR
✅ 1-on-1 coaching to apply AI to your unique PR challenges
✅ Capstone project tailored to your workflow
✅ AI tools for media monitoring, content creation, and strategic PR execution
💡 Early Bird Special: Enroll by February 21 and get 25% off with code EARLYBIRD25
🔗 Spots are limited—enroll now! 👇

A BBC study aims to save AI from itself
Ever since ChatGPT debuted to the public, we've known it sometimes makes things up. How often this happens, however, has never been entirely clear. We know it's often enough to be a serious impediment to fully automating informational tasks (like article writing), but many in the AI community subscribe to the notion that, as model performance improves at a rapid pace, these digital flights of fancy are on the decline.
Well, we can probably toss that notion out the window now that the BBC has published a study that directly tests the output from four separate popular AI tools. The study authors fed queries about news events into ChatGPT, Perplexity, Gemini, and Microsoft Copilot. The queries specifically pointed to BBC reporting, since the broadcaster obviously could speak authoritatively about its own reporting. They then asked a team of BBC journalists to evaluate the responses and rate them on whether they had problems and how serious those problems were.
The results don't inspire confidence, to say the least. The broad takeaway is that, for the purposes of summarizing news information, the most popular AI services have problems with accuracy a majority of the time — a slim majority (51%), but still a majority. The study was done in December of 2024, so the AI models tested are representative of what's still in use today.
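To make the methodology concrete: the study boils down to human reviewers labeling each AI response and then tallying how many were flagged. The sketch below is a hypothetical illustration of that kind of aggregation — the labels, data, and `share_with_issues` function are my own invented stand-ins, not the BBC's actual rubric or figures.

```python
# Hypothetical reviewer ratings: each AI response gets a severity label.
# These records are illustrative only, not data from the BBC study.
ratings = [
    {"tool": "ChatGPT",    "severity": "significant"},
    {"tool": "Gemini",     "severity": "none"},
    {"tool": "Perplexity", "severity": "minor"},
    {"tool": "Copilot",    "severity": "significant"},
]

def share_with_issues(ratings, threshold={"minor", "significant"}):
    """Fraction of rated responses flagged at or above the given severity."""
    flagged = sum(1 for r in ratings if r["severity"] in threshold)
    return flagged / len(ratings)

# With this toy data, 3 of 4 responses are flagged.
print(f"{share_with_issues(ratings):.0%} of responses had issues")  # 75% of responses had issues
```

A real evaluation would of course involve many reviewers, inter-rater agreement checks, and per-criterion scores (accuracy, attribution, editorializing), but the headline "majority of responses had problems" statistic is ultimately a ratio like this one.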
The BBC didn't look solely at factual accuracy. It also gauged responses on several other factors: how the AI attributed sources, whether it drew a clear line between fact and opinion or inserted its own editorializing language, how much context it provided, and how it represented the BBC in the response (something the broadcaster is probably even more sensitive about after that Apple Intelligence snafu).
I think this study is an important step in the journey of making AI truly useful, not just within newsrooms but to the consumers of news. The reality is that as AI becomes increasingly ubiquitous, more people will use AI services to gather information. The news industry needs to offer more nuanced and constructive advice than just, "Don't use it."
What the BBC gets right
While the findings on the specific queries and responses are interesting (check them out here if you'd like to dive deeper), the reasons the study stands out as a piece of AI research have to do with its premise and methodology: