The Media Copilot

Mastering AI Data Scraping for Text and Video: A Guide for Journalists

How to use AI scraping to extract data from websites and videos to power better journalism.

Christopher Allbritton
Nov 06, 2024
∙ Paid

Web scraping can be a helpful tool for journalists, and AI has made it easier to use than ever. (Credit: Midjourney)

For journalists today, working with data using AI and other analysis tools is crucial. But just as important is gathering that data, which isn’t always straightforward. It’s unlikely a corporation you’re investigating will hand over a nicely formatted PDF outlining all of its OSHA violations, is it?

For those unfamiliar with it, web scraping is a valuable technique for data journalists. A skilled data journalist can code a scraper that crawls a website or series of websites, collects data, and returns it in a usable format. With that volume of data, journalists can tell trend stories, conduct deep-dive investigations, verify data, and access data that would otherwise be out of reach.
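As a rough sketch of that crawl, collect, and return loop, the Python below walks a small set of pages, follows the links it finds, and returns each page's title as a record. The pages live in an in-memory dictionary (`PAGES`, a hypothetical stand-in for a real site) so the example runs without network access; a real scraper would download each URL instead, and the class and function names here are illustrative only.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for a website: path -> HTML. A real scraper would
# download these pages (for example with urllib.request) instead.
PAGES = {
    "/": '<h1>Inspections</h1><a href="/2023">2023</a> <a href="/2024">2024</a>',
    "/2023": '<h1>2023 violations</h1><a href="/">Home</a>',
    "/2024": '<h1>2024 violations</h1><a href="/">Home</a>',
}


class PageParser(HTMLParser):
    """Collects the first <h1> text and every link on one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_h1 = False

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_h1 = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        if self._in_h1 and not self.title:
            self.title = data.strip()


def crawl(start):
    """Visit every page reachable from `start`, once each, breadth-first."""
    seen, queue, records = {start}, [start], []
    while queue:
        path = queue.pop(0)
        parser = PageParser()
        parser.feed(PAGES[path])
        records.append({"path": path, "title": parser.title})
        for link in parser.links:
            # Skip links that leave the known site or were already queued.
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return records


records = crawl("/")
```

Each entry in `records` is a plain dictionary, so the result can be passed straight to `csv.DictWriter` or loaded into a spreadsheet. The `seen` set is what keeps the crawler from looping forever when pages link back to each other.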

That’s the old way of doing it. Today, AI web scrapers have entered the picture. They differ from traditional web scrapers in that they take less setup time and skill (many can be used without writing a line of code) and are more robust. Older web scrapers might struggle with websites that have a dynamic layout or that otherwise make data difficult to extract.

For truly scraper-hostile websites, I’ll introduce a new technique called “video scraping” that leverages the power of AI.

Here’s a comprehensive guide on harnessing these powerful tools and diving into modern data-driven journalism.

Why AI Scraping is Key in Data Journalism

A deep ocean of data is available online, but finding it in a usable, structured form is challenging. What good is a massive table of 911 call-response data for your area if you have to type every value into a spreadsheet by hand?
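To make the alternative to hand entry concrete, here is a minimal sketch that turns an HTML table into spreadsheet-ready CSV using only Python's standard library. The table contents and column names are invented for illustration, and real pages are usually messier (nested tables, merged cells, data loaded by scripts), so treat this as a starting point rather than a finished tool.

```python
import csv
import io
from html.parser import HTMLParser

# Invented sample data standing in for a published call-response table.
SAMPLE_HTML = """
<table>
  <tr><th>Call ID</th><th>Type</th><th>Response minutes</th></tr>
  <tr><td>1001</td><td>Medical</td><td>7</td></tr>
  <tr><td>1002</td><td>Fire</td><td>11</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Gathers the text of each <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableParser()
    parser.feed(html)
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


csv_text = table_to_csv(SAMPLE_HTML)
```

Writing `csv_text` to a file with a `.csv` extension gives something any spreadsheet program can open directly.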

© 2025 AnyWho Media LLC