Datasette IP Rate Limit Plugin Against Aggressive Crawlers

In a nutshell: Datasette.io deploys a GPT-assisted rate-limiting plugin that restricts aggressive crawlers through IP blocking.

The Datasette.io portal uses a new plugin for IP-based throttling to protect itself from intrusive web crawlers. The configurable system blocks addresses that request specific pages disproportionately often.

The Datasette.io site was hit by uncontrolled crawler traffic that systematically hammered individual sections of the site. In response, a Python plugin called datasette-ip-rate-limit was developed to throttle incoming requests per IP address.

In the current production configuration, the plugin reads the client address from the “Fly-Client-IP” header and tracks up to 10,000 IP addresses at a time. Paths such as “/static/*” and “/-/turnstile*” are exempt from throttling. Demo databases (paths such as “/global-power-plants/*” and “/legislators/*”) get stricter limits: at most 60 requests per 60-second window. An IP that exceeds this limit is blocked for 20 seconds.
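To make these parameters concrete, here is a minimal Python sketch of how such a configuration could be represented and matched against request paths. The dictionary keys and the `rule_for_path` helper are hypothetical names chosen for illustration; the plugin's actual configuration schema may differ. Only the header name, paths, and numbers come from the description above.

```python
from fnmatch import fnmatchcase

# Hypothetical structure: the key names are illustrative assumptions,
# not the plugin's documented schema. The values are from the article.
CONFIG = {
    "header": "Fly-Client-IP",
    "max_tracked_ips": 10_000,
    "exempt_paths": ["/static/*", "/-/turnstile*"],
    "rules": [
        {
            "paths": ["/global-power-plants/*", "/legislators/*"],
            "max_requests": 60,
            "window_seconds": 60,
            "block_seconds": 20,
        }
    ],
}


def rule_for_path(path, config=CONFIG):
    """Return the first rule matching `path`, or None if exempt or unmatched."""
    # Exempt paths are checked first so they are never throttled.
    if any(fnmatchcase(path, pattern) for pattern in config["exempt_paths"]):
        return None
    for rule in config["rules"]:
        if any(fnmatchcase(path, pattern) for pattern in rule["paths"]):
            return rule
    return None
```

In this sketch, a request for “/legislators/legislators” would pick up the 60-requests-per-60-seconds rule, while “/static/app.css” would bypass throttling entirely.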

The plugin thus gives administrators granular control over access patterns. Different rate limits can be defined per section of the site, while static content and specific services can be exempted. Requests are counted using a sliding window rather than fixed timeouts, which allows for more flexible throttling.
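The article does not say how the sliding window is implemented. One common approach, shown here as a rough Python sketch and not as the plugin's actual code, keeps a queue of recent request timestamps per IP and discards those older than the window. The class name `SlidingWindowLimiter`, its parameters, and the eviction of the least recently seen IP once the tracking limit is reached are all assumptions made for illustration.

```python
import time
from collections import OrderedDict, deque


class SlidingWindowLimiter:
    """Illustrative per-IP sliding-window limiter (hypothetical, not the plugin)."""

    def __init__(self, max_requests=60, window_seconds=60, block_seconds=20,
                 max_keys=10_000, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.block_seconds = block_seconds
        self.max_keys = max_keys
        self.clock = clock  # injectable so the behaviour can be tested
        # ip -> [deque of request timestamps, blocked_until timestamp]
        self._state = OrderedDict()

    def allow(self, ip):
        """Return True if the request may proceed, False if it is throttled."""
        now = self.clock()
        entry = self._state.get(ip)
        if entry is None:
            entry = self._state[ip] = [deque(), 0.0]
            # Assumption: drop the least recently seen IP when over capacity.
            if len(self._state) > self.max_keys:
                self._state.popitem(last=False)
        else:
            self._state.move_to_end(ip)
        hits, blocked_until = entry
        if now < blocked_until:
            return False
        # Slide the window: forget requests older than window_seconds.
        cutoff = now - self.window_seconds
        while hits and hits[0] <= cutoff:
            hits.popleft()
        if len(hits) >= self.max_requests:
            entry[1] = now + self.block_seconds
            return False
        hits.append(now)
        return True
```

With the defaults taken from the configuration described above, the 61st request from one IP inside any 60-second span would be rejected and the IP blocked for 20 seconds. Note that in this sketch a block can be renewed if the window is still full when it expires.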


Source: ainews-dev.lumi-systems.io · Published May 14, 2026
Lumi AI News — AI-assisted curation in accordance with Art. 50 EU AI Act. Paraphrase and classification by Lumi News Pipeline v1.5.2.
