Cloudflare has accused Perplexity AI of bypassing website restrictions designed to prevent content scraping. According to Cloudflare, Perplexity's crawlers continued to access and scrape content even after website owners put technical blocks in place, including robots.txt directives and web application firewall (WAF) rules.
Perplexity allegedly evades detection by modifying its user agent, rotating IP addresses, and switching between autonomous system numbers (ASNs). Some observers report that it uses generic browser user-agent strings that impersonate Google Chrome to circumvent blocks. When its crawlers are blocked, Perplexity reportedly falls back on other sources, though its answers become less specific. This behaviour has raised concerns about intellectual property rights and the need for clearer guidelines on web data usage.
Cloudflare says it is working to fingerprint the stealth crawlers using machine learning and network signals. The findings have sparked debate about the ethics and business practices of AI-powered answer engines.
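
To illustrate why a changed user agent matters for the robots.txt directives mentioned above, here is a minimal sketch using Python's standard `urllib.robotparser`. The crawler name `ExampleAIBot`, the sample rules, and the URL are hypothetical and are not taken from Cloudflare's report; the point is only that robots.txt rules are matched against whatever name a client declares, and compliance is voluntary.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler, allow everyone else.
# "ExampleAIBot" is an illustrative name, not a real crawler.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A generic browser-style identity (assumed example string).
CHROME_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)

declared = is_allowed(ROBOTS_TXT, "ExampleAIBot", "https://example.com/article")
generic = is_allowed(ROBOTS_TXT, CHROME_UA, "https://example.com/article")

print("declared crawler allowed:", declared)
print("generic browser identity allowed:", generic)
```

In this sketch the rule applies only to a client that identifies itself as `ExampleAIBot`. The same request sent under a browser-like user agent falls through to the wildcard rule, which is why site owners also rely on enforcement layers such as WAF rules and network-level fingerprinting.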




