Miasma Poisons AI Training Data

29 March 2026

What happened

Austin Weeks released Miasma, an open-source tool that traps AI web scrapers by serving intentionally "poisoned" training data and self-referential links. The aim is to degrade the quality of AI models that ingest the corrupted data, turning unwanted scraping into a resource drain for model developers. Miasma runs with a minimal memory footprint, roughly 50-60 MB for 50 concurrent connections. To deploy it, site owners embed hidden links on their pages that direct scrapers to a Miasma instance, and must configure robots.txt carefully so legitimate bots are not caught.
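A deployment along these lines might look like the sketch below. The hostname, paths, and robots.txt rules are illustrative assumptions, not taken from the Miasma project itself: a hidden link lures crawlers into the trap, while robots.txt steers well-behaved bots away from it.

```html
<!-- Illustrative hidden link on a public page: invisible to human
     visitors, but followed by crawlers that ignore such hints.
     The URL and path are hypothetical examples. -->
<a href="https://miasma.example.com/trap/entry"
   style="display:none" aria-hidden="true" tabindex="-1">archive</a>
```

```text
# Illustrative robots.txt: keep legitimate crawlers out of the trap
# (the /trap/ path is an assumption for this example).
User-agent: Googlebot
Disallow: /trap/

User-agent: Bingbot
Disallow: /trap/
```

Bots that honour robots.txt never see the poisoned pages; scrapers that ignore it follow the hidden link and get stuck, which is the asymmetry the tool relies on.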

Why it matters

This gives content creators a new defence against unauthorised data scraping, which matters directly to founders, CTOs, and architects responsible for digital assets and intellectual property. By serving corrupted data that degrades the models trained on it, Miasma shifts some of the cost of scraping back onto model developers, and its low resource footprint keeps deployment cheap. The key constraint is setup: robots.txt must be configured precisely so that beneficial web crawlers are not blocked by mistake. Miasma joins a broader wave of tools built to combat AI scraping, including Cloudflare's AI Labyrinth, which uses AI-generated content to confuse scrapers and waste their resources.
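To make the poisoning mechanism concrete, here is a minimal sketch of how a tarpit might generate self-referential gibberish pages. This is an illustration of the general technique, not Miasma's actual implementation; the word list, path scheme, and function names are all assumptions.

```python
import hashlib
import random

# Hypothetical filler vocabulary for generating gibberish text.
WORDS = ["lattice", "quorum", "sediment", "parallax", "vellum",
         "osmosis", "turbine", "gossamer", "argon", "tessera"]

def generate_page(path: str, n_links: int = 5, n_words: int = 40) -> str:
    """Deterministically generate a gibberish HTML page for `path`,
    whose links point only to other paths inside the same tarpit.

    Seeding the RNG from the path keeps each page stable across
    requests (cheap to serve, no state to store) while remaining
    unique per URL, so a crawler following the links never escapes
    and never sees a repeated page.
    """
    seed = int.from_bytes(hashlib.sha256(path.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    body = " ".join(rng.choice(WORDS) for _ in range(n_words))
    links = "".join(
        f'<a href="/trap/{rng.getrandbits(32):08x}">{rng.choice(WORDS)}</a> '
        for _ in range(n_links)
    )
    return f"<html><body><p>{body}</p><nav>{links}</nav></body></html>"
```

Because pages are derived purely from their URL, the trap costs almost nothing per request, which is consistent with the small memory footprint the article describes.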

Source: github.com
