Perplexity keeps crawling websites, even when it's told no, says Cloudflare.
Yesterday, Cloudflare published a report accusing Perplexity AI of stealthily bypassing website restrictions. The cloud services firm said Perplexity continued to crawl content from sites that had explicitly banned it. The evidence came from a controlled test designed to trap unauthorized bots.
Perplexity had previously been blocked using both robots.txt files and firewall rules. Despite these clear signals, the AI chatbot still returned content from the restricted websites.
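To make the robots.txt part concrete, here is a minimal sketch of how a well-behaved crawler is expected to check a site's rules before fetching a page. It uses Python's standard `urllib.robotparser`. The robots.txt contents, the `PerplexityBot` and `SomeOtherBot` names, and the `may_fetch` helper are illustrative assumptions, not taken from Cloudflare's test setup.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site; the bot name is illustrative.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
# parse() accepts the file's lines directly, so no network request is made.
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(user_agent: str, url: str) -> bool:
    """Return True if the parsed robots.txt permits this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


blocked = may_fetch("PerplexityBot", "https://example.com/article")
allowed = may_fetch("SomeOtherBot", "https://example.com/article")
print(blocked, allowed)
```

Note that robots.txt is purely advisory: nothing in the protocol stops a crawler from skipping this check, which is why site owners often add firewall rules on top.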
This is not the first time Perplexity has been under scrutiny. The company has faced multiple accusations of ignoring consent and reusing content without permission.
Cloudflare's investigation revealed sophisticated evasion tactics. When Perplexity's official crawlers got blocked, the company allegedly switched to undeclared crawlers whose user agents impersonated regular Chrome browsers on macOS.
These undeclared crawlers rotated through IP addresses and switched between different ASNs (autonomous system numbers, which identify the network a request comes from) to further dodge website blocks. This let them slip right past firewall protections that were specifically designed to keep their known crawler addresses out.
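A rough sketch shows why a block keyed on the declared user agent is easy to sidestep. The rule below is a deliberately naive stand-in for a firewall rule, not Cloudflare's actual logic. The token list, the `firewall_blocks` helper, and both user-agent strings are assumptions made up for illustration.

```python
# Hypothetical declared crawler names a site owner might block.
BLOCKED_AGENT_TOKENS = ("perplexitybot", "perplexity-user")


def firewall_blocks(user_agent: str) -> bool:
    """Naive rule: block if the User-Agent contains a known crawler token."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_AGENT_TOKENS)


# A crawler that announces itself (illustrative string).
declared_ua = "Mozilla/5.0 (compatible; PerplexityBot/1.0)"

# A generic Chrome-on-macOS string (illustrative), as a spoofing client might send.
generic_ua = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/124.0.0.0 Safari/537.36"
)

print(firewall_blocks(declared_ua))  # the declared crawler matches the rule
print(firewall_blocks(generic_ua))   # the browser-like string does not
```

Because the User-Agent header is chosen by the client, a rule like this only stops crawlers that identify themselves honestly. Catching the rest takes other signals, such as the network and behavioral patterns Cloudflare describes.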
Cloudflare tested this sneaky behavior using brand-new domains that had never been indexed by any search engine. Despite having strict blocking rules in place, Perplexity still somehow accessed and returned detailed information from these supposedly protected sites.
And the scale of this? Absolutely massive! We're talking millions of daily requests hitting tens of thousands of domains. This clearly wasn't some accidental oversight but systematic circumvention.
When these crawlers were successfully blocked, Perplexity's answers became less detailed, proving the blocks were working.
Cloudflare expects these subversive bots to come crawling back with even sneakier tactics. The company warns that bot evasion techniques will just keep evolving as AI firms continue trying to evade detection.
And now, in a turn of events, Perplexity has fired back at Cloudflare’s claims, saying they don’t crawl the web like traditional bots. Instead, they say their AI only visits websites when users ask specific questions, and the content is used immediately, not stored or used for training.
They also claimed Cloudflare got it wrong by mistaking traffic from Browserbase, a third-party tool that Perplexity says it uses only in a limited way, for Perplexity's own. According to Perplexity, Cloudflare’s entire argument was based on bad traffic analysis, misleading diagrams, and a lack of understanding about how modern AI assistants actually work.
💬 What do you think of this situation? Who is to blame here?