Open Source devs say AI crawlers dominate traffic, forcing blocks on entire countries

March 26, 2025


“Any time one of these crawlers pulls from my tarpit, it’s resources they’ve consumed and will have to pay hard cash for,” Aaron explained to Ars. “It effectively raises their costs. And seeing how none of them have turned a profit yet, that’s a big problem for them.”

On Friday, Cloudflare announced “AI Labyrinth,” a similar but more commercially polished approach. Unlike Nepenthes, which is designed as an offensive weapon against AI companies, Cloudflare positions its tool as a legitimate security feature to protect website owners from unauthorized scraping, as we reported at the time.

“When we detect unauthorized crawling, rather than blocking the request, we will link to a series of AI-generated pages that are convincing enough to entice a crawler to traverse them,” Cloudflare explained in its announcement. The company reported that AI crawlers generate over 50 billion requests to its network daily, accounting for nearly 1 percent of all web traffic it processes.

The community is also developing collaborative tools to help protect against these crawlers. The “ai.robots.txt” project offers an open list of web crawlers associated with AI companies and provides premade robots.txt files that implement the Robots Exclusion Protocol, as well as .htaccess files that return error pages when detecting AI crawler requests.
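To make the robots.txt approach concrete, here is a minimal Python sketch of the general idea, not the project's own tooling. It builds a robots.txt that disallows a short list of crawler user agents and then checks it with the standard library's `urllib.robotparser`. The three agent names are only an illustrative subset, and the helper names `build_robots_txt` and `is_allowed` and the `example.org` URL are assumptions made for this example.

```python
from urllib.robotparser import RobotFileParser

# Illustrative subset only (assumption); a maintained list would be much longer.
AI_CRAWLERS = ["GPTBot", "CCBot", "ClaudeBot"]


def build_robots_txt(agents):
    """Return robots.txt text that disallows the whole site for the given agents."""
    lines = [f"User-agent: {agent}" for agent in agents]
    lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt, user_agent, url):
    """Check whether a crawler that honors robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    robots = build_robots_txt(AI_CRAWLERS)
    print(robots)
    print(is_allowed(robots, "GPTBot", "https://example.org/repo"))
    print(is_allowed(robots, "SomeBrowser", "https://example.org/repo"))
```

Compliance with the Robots Exclusion Protocol is voluntary, so a file like this only deters crawlers that choose to honor it. Server-side rules such as the .htaccess files mentioned above are what actually refuse the request.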

As it currently stands, both the rapid growth of AI-generated content overwhelming online spaces and aggressive web-crawling practices by AI firms threaten the sustainability of essential online resources. The current approach taken by some large AI companies—extracting vast amounts of data from open-source projects without clear consent or compensation—risks severely damaging the very digital ecosystem on which these AI models depend.

Responsible data collection may be achievable if AI firms collaborate directly with the affected communities. However, prominent industry players have shown little incentive to adopt more cooperative practices. Without meaningful regulation or self-restraint by AI firms, the arms race between data-hungry bots and those attempting to defend open source infrastructure seems likely to escalate further, potentially deepening the crisis for the digital ecosystem that underpins the modern Internet.
