Cloudflare has accused artificial intelligence startup Perplexity of using stealthy techniques to evade blocks and scrape web pages without permission. In response, Cloudflare de-listed Perplexity as a verified bot and added heuristics to block its crawling activity.
According to Cloudflare, Perplexity’s behavior is incompatible with recommended practices for crawlers, which include being transparent, serving a clear purpose, performing specific activities, and following website directives. Cloudflare says that even after customers disallowed Perplexity’s crawlers in their robots.txt files, the company was still able to access their content.
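For readers unfamiliar with robots.txt, the sketch below shows how a crawler that follows website directives would typically check a site's rules before fetching a page. It uses Python's standard `urllib.robotparser` module. The robots.txt contents, the `example.com` URL, the `ExampleBot` name, and the `is_allowed` helper are illustrative assumptions, not taken from Cloudflare's report or from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks one named crawler and allows all others.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"
    # A compliant crawler runs this check and skips the page when it fails.
    print(is_allowed(ROBOTS_TXT, "PerplexityBot", page))
    print(is_allowed(ROBOTS_TXT, "ExampleBot", page))
```

Note that robots.txt is advisory: the check only has an effect if the crawler identifies itself honestly and chooses to obey the result, which is why the dispute centers on whether the directives were followed.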
Cloudflare has been cracking down on AI systems that scrape websites without permission, allowing customers to block web crawlers deployed to scrape their sites or to charge them fees. The company cited OpenAI as an example of a company that follows recommended practices for crawlers and respects blocks.
Perplexity has responded by calling Cloudflare’s actions “embarrassing” and “disqualifying”. The AI company has faced allegations of unethical web scraping in the past, including a threatened lawsuit from the BBC.
Source: https://cyberscoop.com/perplexity-blocks-on-crawlers-cloudflare