Problem
Currently using Workers KV for both minute and hour rate limiting (workers/osa-worker/index.js:71-98), which has issues:
- KV writes for bot protection: Each request = 2 KV writes (minute + hour keys)
  - Adds ~10-50ms latency for bot checks on every request
  - On free plan: Would hit 1,000 writes/day limit in ~8 hours
  - On Pro plan: Unlimited but still adds latency
- Purpose mismatch:
  - Per-minute limit (bot prevention): Needs to be fast, per-request
  - Per-hour limit (human abuse prevention): Can tolerate latency, needs global consistency
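The write budget above can be sanity-checked with a little arithmetic. The sketch below is illustrative only: `hoursUntilQuota` is a hypothetical helper, and the traffic figure of 62.5 requests/hour is an assumption chosen to match the ~8 hour estimate, not a measurement from this Worker.

```javascript
// Hypothetical helper: hours until a daily KV write quota is exhausted,
// given a steady request rate and the number of KV writes per request.
function hoursUntilQuota(dailyWriteLimit, requestsPerHour, writesPerRequest) {
  const writesPerHour = requestsPerHour * writesPerRequest;
  if (writesPerHour <= 0) return Infinity; // no writes, quota is never reached
  return dailyWriteLimit / writesPerHour;
}

const FREE_PLAN_WRITES_PER_DAY = 1000; // figure quoted in this issue

// Assumed traffic: 62.5 requests/hour, 2 writes each (minute + hour keys)
console.log(hoursUntilQuota(FREE_PLAN_WRITES_PER_DAY, 62.5, 2)); // 8
// Same assumed traffic with 1 write per request (hour key only)
console.log(hoursUntilQuota(FREE_PLAN_WRITES_PER_DAY, 62.5, 1)); // 16
```

Under this assumption, the ~8 hour figure corresponds to roughly one request per minute; heavier traffic would exhaust the quota proportionally sooner.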
Solution: Hybrid Approach
Use the built-in Rate Limiting API for the per-minute limit and KV for the per-hour limit.
Benefits
- ✅ 50% reduction in KV writes: 1 write/request (hour) instead of 2 (minute + hour)
- ✅ Faster bot protection: <1ms (built-in API) vs ~10-50ms (KV) for critical first check
- ✅ Global hourly limits: KV counters are shared across all edge locations (KV is eventually consistent, so counts are approximate rather than exact)
- ✅ Pro Plan friendly: 1 write/request is totally fine on Pro Plan (unlimited)
- ✅ Best of both: Speed where it matters (bots), global consistency where it matters (humans)
Why Hybrid vs Full Built-in API?
Built-in API limitations:
- Only supports period: 10 or period: 60 seconds (cannot do hourly)
- Per-location enforcement (not global)
- Static configuration in wrangler.toml (cannot vary dev/prod dynamically)
KV advantages for hourly:
- Supports arbitrary time windows (3600s for hourly)
- Shared across all Cloudflare locations (eventually consistent, which is acceptable for an hourly limit)
- Dynamic limits based on environment
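To illustrate why KV can support arbitrary windows, here is a minimal sketch of fixed-window key construction. The key layout mirrors the `rl:hour:...` key used in this issue; `windowKey` and `secondsUntilReset` are hypothetical helper names, and timestamps are assumed to be Unix seconds.

```javascript
// Build a fixed-window counter key: all requests from `ip` that fall inside
// the same window share one key, so one counter per (ip, window) is enough.
function windowKey(prefix, ip, nowSeconds, windowSeconds) {
  const windowIndex = Math.floor(nowSeconds / windowSeconds);
  return `${prefix}:${ip}:${windowIndex}`;
}

// Seconds left until the current window rolls over to a fresh counter.
function secondsUntilReset(nowSeconds, windowSeconds) {
  return windowSeconds - (nowSeconds % windowSeconds);
}

console.log(windowKey('rl:hour', '203.0.113.7', 7200, 3600));  // rl:hour:203.0.113.7:2
console.log(windowKey('rl:hour', '203.0.113.7', 10799, 3600)); // rl:hour:203.0.113.7:2
console.log(windowKey('rl:hour', '203.0.113.7', 10800, 3600)); // rl:hour:203.0.113.7:3
console.log(secondsUntilReset(10799, 3600));                   // 1
```

Changing `windowSeconds` is all it takes to get a different window length, which is the flexibility the built-in API lacks.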
Technical Implementation
Per-Minute (Built-in API)
# wrangler.toml
[[ratelimits]]
name = "RATE_LIMITER_MINUTE"
namespace_id = "1001"
simple = { limit = 10, period = 60 }
[[env.dev.ratelimits]]
name = "RATE_LIMITER_MINUTE"
namespace_id = "1002"
simple = { limit = 60, period = 60 }
// index.js - Fast bot check
const { success } = await env.RATE_LIMITER_MINUTE.limit({ key: ip });
if (!success) {
return { allowed: false, reason: 'Too many requests per minute' };
}
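The snippet above uses `ip` without showing where it comes from. One possible way to derive it, sketched under assumptions: `clientKey` is a hypothetical helper, and the `'unknown'` fallback is only a guess at how to handle local runs where the header is absent. Cloudflare sets the `CF-Connecting-IP` header on requests it proxies.

```javascript
// Hypothetical helper: derive the rate-limit key from the incoming request.
// Assumes `request` exposes a Fetch-style `headers.get()` method.
function clientKey(request) {
  // Fallback value is an assumption for local testing without the header
  return request.headers.get('CF-Connecting-IP') || 'unknown';
}
```

All requests without the header would share the single `'unknown'` bucket, so a stricter deployment might prefer to reject them instead.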
Per-Hour (KV)
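// index.js - Global human abuse check
// `now` is assumed to be in Unix seconds, so each key covers one hour-long window
const hourKey = `rl:hour:${ip}:${Math.floor(now / 3600)}`;
const hourCount = parseInt((await env.RATE_LIMITER_KV.get(hourKey)) || '0', 10);
if (hourCount >= CONFIG.RATE_LIMIT_PER_HOUR) {
  return { allowed: false, reason: 'Too many requests per hour' };
}
// Read-then-write is not atomic, so concurrent requests can undercount slightly
// The 2-hour TTL lets the key expire after its window has passed
await env.RATE_LIMITER_KV.put(hourKey, (hourCount + 1).toString(), { expirationTtl: 7200 });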
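Putting the two checks together, a hybrid `checkRateLimit()` might look roughly like the sketch below. This is not the final implementation: the parameter list is an assumption (`limitPerHour` stands in for `CONFIG.RATE_LIMIT_PER_HOUR`, and `nowSeconds` is assumed to be Unix seconds), and `makeFakeEnv` is a hypothetical in-memory stand-in for the two bindings so the logic can be exercised locally.

```javascript
// Sketch of a hybrid check: built-in limiter first, then the KV hourly counter.
async function checkRateLimit(env, ip, nowSeconds, limitPerHour) {
  // 1. Per-minute bot check (no KV access on this path)
  const { success } = await env.RATE_LIMITER_MINUTE.limit({ key: ip });
  if (!success) {
    return { allowed: false, reason: 'Too many requests per minute' };
  }

  // 2. Per-hour check: one KV read, and one KV write only when allowed
  const hourKey = `rl:hour:${ip}:${Math.floor(nowSeconds / 3600)}`;
  const hourCount = parseInt((await env.RATE_LIMITER_KV.get(hourKey)) || '0', 10);
  if (hourCount >= limitPerHour) {
    return { allowed: false, reason: 'Too many requests per hour' };
  }
  await env.RATE_LIMITER_KV.put(hourKey, String(hourCount + 1), { expirationTtl: 7200 });
  return { allowed: true };
}

// Hypothetical in-memory stand-ins for the two bindings (local experiments only).
function makeFakeEnv(minuteAllowed = true) {
  const store = new Map();
  let writes = 0;
  return {
    RATE_LIMITER_MINUTE: { limit: async () => ({ success: minuteAllowed }) },
    RATE_LIMITER_KV: {
      get: async (key) => (store.has(key) ? store.get(key) : null),
      put: async (key, value) => { store.set(key, value); writes += 1; },
    },
    kvWrites: () => writes, // lets a test confirm the 1-write-per-request claim
  };
}
```

With `limitPerHour` set to 2, the first two calls in the same hour window should be allowed (one KV write each) and the third rejected without a write. A request blocked by the per-minute limiter never touches KV at all.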
Implementation
Files to change:
- wrangler.toml:
  - Add [[ratelimits]] for per-minute (built-in API)
  - Keep [[kv_namespaces]] for per-hour (KV)
  - Rename KV binding to RATE_LIMITER_KV for clarity
- index.js:
  - Check built-in API first (per-minute, fast)
  - Then check KV (per-hour, global)
  - Only 1 KV write instead of 2
- README.md:
  - Document hybrid approach
  - Explain rationale (bot vs human protection)
Performance Comparison
| Metric | Old (KV only) | New (Hybrid) |
|---|---|---|
| KV writes/request | 2 | 1 |
| Bot check latency | ~10-50ms | <1ms |
| Hourly limit scope | Global ✓ | Global ✓ |
| Writes/hour @ 10 req/min | 1,200 | 600 |
| Pro Plan cost | Unlimited | Unlimited |
Migration Steps
- Update wrangler.toml (add ratelimits, keep kv_namespaces)
- Update checkRateLimit() in index.js (hybrid implementation)
- Deploy to dev: wrangler deploy --env dev
- Test both limits work
- Deploy to production: wrangler deploy
Testing Plan
# Test per-minute limit (dev allows 60/min, so send more than 60 requests within one minute)
for i in {1..70}; do
  curl -X POST https://osa-worker-dev.yahyaqaraeen.workers.dev/hed/ask \
    -H "Content-Type: application/json" \
    -d '{"question":"test"}' \
    -w "\n%{http_code}\n"
  sleep 0.2
done
# Test per-hour limit (would need 61+ requests in 1 hour)