http {
    # ... other http settings

    # Shared-memory zone keyed on client IP: 10 MB of state, 10 requests/second
    limit_req_zone $binary_remote_addr zone=mylimit:10m rate=10r/s;

    server {
        # ... other server settings

        location / {
            # Allow bursts of up to 20 requests and serve them without queuing delay
            limit_req zone=mylimit burst=20 nodelay;

            # ... proxy_pass or other location-specific settings
        }
    }
}
Rate-limit read-only access at the very least. I know this is a hard problem for open-source projects that have relied on open web access like this for a while. Anubis?
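One way to limit only read-only traffic in nginx is to key the zone on a variable that is empty for anything other than GET/HEAD, since nginx skips rate-limit accounting when the key is empty. This is just a sketch; the $limit_key variable name, zone name, rate, and burst values are placeholders to adapt, and both the map and limit_req_zone belong in the http block:

# in the http block:
map $request_method $limit_key {
    default  "";                    # non-read-only methods: empty key, not counted
    GET      $binary_remote_addr;   # read-only methods are limited per client IP
    HEAD     $binary_remote_addr;
}

limit_req_zone $limit_key zone=readonly:10m rate=5r/s;

server {
    location / {
        limit_req zone=readonly burst=10 nodelay;
        # ... proxy_pass or other location-specific settings
    }
}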
As noted by others, the scrapers do not seem to respond to rate limiting. When you're being hit by 10-100k different IPs per hour and none of them back off when limited, per-IP rate limiting isn't very effective.