AI Use in American Newspapers

This repo hosts AI News Audit ("AI use in American newspapers is widespread, uneven, and rarely disclosed"), which analyzes 250,000+ news articles to detect and track AI-generated content across media sources.

🌐 Website: https://ainewsaudit.github.io/

Authors: Jenna Russell, Marzena Karpinska, Destiny Akinode, Katherine Thai, Bradley Emi, Max Spero, and Mohit Iyyer

Introduction

AI is rapidly transforming journalism, but the extent of its use in published U.S. newspaper articles remains unclear. We address this gap by auditing a large-scale dataset of 186K articles from 1.5K American newspapers published in the summer of 2025. Using Pangram, a state-of-the-art AI detector, we discover that approximately 9% of newly published articles are either partially or fully AI-generated. This AI use is unevenly distributed, appearing more frequently in smaller, local outlets, in specific topics such as weather and technology, and within certain ownership groups. We also analyze 45K opinion pieces from The Washington Post, The New York Times, and The Wall Street Journal, finding that they are 6.4 times more likely to contain AI-generated content than news articles from the same publications, with many AI-flagged op-eds authored by prominent public figures. Despite this prevalence, we find that AI use is rarely disclosed: a manual audit of 100 AI-flagged articles found only five disclosures of AI use. Overall, our audit highlights the immediate need for greater transparency and updated editorial standards regarding the use of AI in journalism to maintain public trust.

💻 Code

Code coming soon!

🔍 What This Site Does

This platform helps you understand the prevalence of AI-generated content in news media by analyzing articles from our three datasets:

Recent News: 186,512 articles from various news sources
Opinions: 44,803 opinion pieces and editorials from WSJ, NYT, and WaPo
Reporters: 20,131 articles from reporter-specific sources
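
As a quick sanity check, the three counts above can be summed to confirm the "250,000+" figure. The snippet below is only an illustrative sketch: the dictionary name and keys are chosen here for the example and are not part of the released data.

```python
# Dataset sizes as listed above; the dict itself is illustrative, not a repo artifact.
DATASET_SIZES = {
    "Recent News": 186_512,
    "Opinions": 44_803,
    "Reporters": 20_131,
}

def total_articles(sizes):
    """Return the total number of articles across all datasets."""
    return sum(sizes.values())

def dataset_shares(sizes):
    """Return each dataset's fraction of the combined total."""
    total = total_articles(sizes)
    return {name: count / total for name, count in sizes.items()}

if __name__ == "__main__":
    print(total_articles(DATASET_SIZES))  # 251446
    for name, share in dataset_shares(DATASET_SIZES).items():
        print(f"{name}: {share:.1%}")
```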

Our data was collected from publicly accessible newspaper sites, either through RSS feeds or available archives. Given the sensitivity of large-scale text collection, we do not release the complete article texts, but instead provide metadata to respect the rights of content owners.

📊 Understanding the Data

AI Detection Categories

We use Pangram to detect AI use. Each article receives one of three labels:

Human: Content written entirely by humans
Mixed: Content with some AI-generated elements
AI: Content entirely generated by AI

Key Metrics

AI Likelihood: Overall probability the article contains AI content
Max AI Likelihood: Highest detection score from any segment of the article
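
The sketch below shows one way the categories and metrics above might be summarized once the metadata is loaded. It assumes each record carries a category label and the two scores. The field names (`label`, `ai_likelihood`, `max_ai_likelihood`), the sample rows, and the `gap` threshold are all hypothetical and do not reflect the repo's actual schema or Pangram's API.

```python
from collections import Counter

# Hypothetical metadata rows; field names and values are assumptions for illustration.
ROWS = [
    {"label": "Human", "ai_likelihood": 0.02, "max_ai_likelihood": 0.05},
    {"label": "Mixed", "ai_likelihood": 0.41, "max_ai_likelihood": 0.97},
    {"label": "AI", "ai_likelihood": 0.99, "max_ai_likelihood": 1.00},
    {"label": "Human", "ai_likelihood": 0.01, "max_ai_likelihood": 0.03},
]

CATEGORIES = ("Human", "Mixed", "AI")

def label_shares(rows):
    """Return the fraction of articles in each detection category."""
    counts = Counter(row["label"] for row in rows)
    total = len(rows)
    if total == 0:
        return {label: 0.0 for label in CATEGORIES}
    return {label: counts[label] / total for label in CATEGORIES}

def flagged_share(rows):
    """Return the fraction of articles labeled Mixed or AI."""
    shares = label_shares(rows)
    return shares["Mixed"] + shares["AI"]

def localized_ai(rows, gap=0.5):
    """Return rows whose highest segment score exceeds the overall score by at least `gap`.

    A large gap suggests AI-like text concentrated in one segment rather than
    spread across the article. The 0.5 default is an arbitrary illustrative value.
    """
    return [r for r in rows if r["max_ai_likelihood"] - r["ai_likelihood"] >= gap]

if __name__ == "__main__":
    print(label_shares(ROWS))
    print(flagged_share(ROWS))
    print(len(localized_ai(ROWS)))
```

Comparing the two scores is only a heuristic: an article with a low AI Likelihood but a high Max AI Likelihood may deserve a closer manual look.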
