Replace KV-based rate limiting with Cloudflare's built-in Rate Limiting API #129

@neuromechanist

Problem

The worker currently uses Workers KV for both the per-minute and the per-hour rate limit (workers/osa-worker/index.js:71-98), which causes two problems:

  1. KV writes for bot protection: Each request = 2 KV writes (minute + hour keys)

    • Adds ~10-50ms latency for bot checks on every request
    • On free plan: Would hit 1,000 writes/day limit in ~8 hours
    • On Pro plan: Unlimited but still adds latency
  2. Purpose mismatch:

    • Per-minute limit (bot prevention): Needs to be fast, per-request
    • Per-hour limit (human abuse prevention): Can tolerate latency, needs global consistency

Solution: Hybrid Approach

Use the built-in Rate Limiting API for the per-minute limit and keep KV for the per-hour limit.

Benefits

  • 50% reduction in KV writes: 1 write/request (hour) instead of 2 (minute + hour)
  • Faster bot protection: <1ms (built-in API) vs ~10-50ms (KV) for critical first check
  • Global hourly limits: KV shares counters across all edge locations (KV is eventually consistent, so counts can lag briefly)
  • Pro Plan friendly: 1 write/request is totally fine on Pro Plan (unlimited)
  • Best of both: Speed where it matters (bots), global consistency where it matters (humans)

Why Hybrid vs Full Built-in API?

Built-in API limitations:

  • Only supports period: 10 or period: 60 seconds (cannot do hourly)
  • Per-location enforcement (not global)
  • Static configuration in wrangler.toml (cannot vary dev/prod dynamically)

KV advantages for hourly:

  • Supports arbitrary time windows (3600s for hourly)
  • Counters shared globally across all Cloudflare locations (eventually consistent)
  • Dynamic limits based on environment
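
As a rough illustration of "dynamic limits based on environment", the hourly limit could be chosen at request time. This is only a sketch: the `ENVIRONMENT` variable name, the `hourlyLimitFor` helper, and the numeric limits are hypothetical and not taken from the worker.

```javascript
// Hypothetical per-environment hourly limits; the real value lives in
// CONFIG.RATE_LIMIT_PER_HOUR and is not shown in this issue.
const HOURLY_LIMITS = { production: 60, dev: 600 };

// Pick the hourly limit from an assumed ENVIRONMENT variable, falling back
// to the production value for unknown or missing environments.
function hourlyLimitFor(env) {
  const name = env.ENVIRONMENT || 'production';
  return Object.hasOwn(HOURLY_LIMITS, name)
    ? HOURLY_LIMITS[name]
    : HOURLY_LIMITS.production;
}
```

Because the limit is looked up in code, it can differ between dev and production without a second KV namespace or a config change per window size.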

Technical Implementation

Per-Minute (Built-in API)

# wrangler.toml
[[ratelimits]]
name = "RATE_LIMITER_MINUTE"
namespace_id = "1001"
simple = { limit = 10, period = 60 }

[[env.dev.ratelimits]]
name = "RATE_LIMITER_MINUTE"
namespace_id = "1002"
simple = { limit = 60, period = 60 }

// index.js - Fast bot check
const { success } = await env.RATE_LIMITER_MINUTE.limit({ key: ip });
if (!success) {
  return { allowed: false, reason: 'Too many requests per minute' };
}
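
The snippet above only returns an `{ allowed, reason }` object. One way the caller might turn a rejection into an HTTP response is sketched here; the 429 status code, the JSON body shape, and the `rateLimitResponse` name are assumptions, since the issue does not specify them.

```javascript
// Convert a rate-limit result into an HTTP response.
// Assumption: rejections are reported as 429 with a JSON error body.
function rateLimitResponse(result) {
  if (result.allowed) {
    return null; // nothing to send; the caller continues normal handling
  }
  return new Response(JSON.stringify({ error: result.reason }), {
    status: 429,
    headers: { 'Content-Type': 'application/json' },
  });
}
```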

Per-Hour (KV)

// index.js - Global human abuse check
// `now` must be a Unix timestamp in seconds so that now / 3600 yields the hour bucket.
const hourKey = `rl:hour:${ip}:${Math.floor(now / 3600)}`;
const hourCount = parseInt((await env.RATE_LIMITER_KV.get(hourKey)) || '0', 10);
if (hourCount >= CONFIG.RATE_LIMIT_PER_HOUR) {
  return { allowed: false, reason: 'Too many requests per hour' };
}
// TTL of two hours lets the key expire after its hour bucket has passed.
await env.RATE_LIMITER_KV.put(hourKey, (hourCount + 1).toString(), { expirationTtl: 7200 });
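
Putting the two checks together, a hybrid `checkRateLimit()` might look roughly like the sketch below. The function signature (passing `nowSeconds` and `hourlyLimit` explicitly) and the `makeFakeEnv` in-memory stand-ins are assumptions made so the sketch can run outside a Worker; they are not the worker's actual code.

```javascript
// Hybrid check: built-in limiter first (per-minute), then KV (per-hour).
// `nowSeconds` is a Unix timestamp in seconds; `hourlyLimit` stands in for
// CONFIG.RATE_LIMIT_PER_HOUR.
async function checkRateLimit(ip, env, nowSeconds, hourlyLimit) {
  const { success } = await env.RATE_LIMITER_MINUTE.limit({ key: ip });
  if (!success) {
    return { allowed: false, reason: 'Too many requests per minute' };
  }

  const hourKey = `rl:hour:${ip}:${Math.floor(nowSeconds / 3600)}`;
  const stored = await env.RATE_LIMITER_KV.get(hourKey);
  const hourCount = parseInt(stored || '0', 10);
  if (hourCount >= hourlyLimit) {
    return { allowed: false, reason: 'Too many requests per hour' };
  }
  // Read-then-write is not atomic, so concurrent requests may undercount.
  await env.RATE_LIMITER_KV.put(hourKey, String(hourCount + 1), { expirationTtl: 7200 });
  return { allowed: true };
}

// In-memory stand-ins for the two bindings, for local experimentation only.
// The fake minute limiter ignores the key and never resets its window.
function makeFakeEnv(minuteLimit) {
  const kv = new Map();
  let minuteCount = 0;
  return {
    RATE_LIMITER_MINUTE: {
      async limit() {
        minuteCount += 1;
        return { success: minuteCount <= minuteLimit };
      },
    },
    RATE_LIMITER_KV: {
      async get(key) { return kv.has(key) ? kv.get(key) : null; },
      async put(key, value) { kv.set(key, value); },
    },
  };
}
```

Checking the built-in limiter first means a request rejected by the per-minute limit never touches KV, so rejected bot traffic costs no KV reads or writes.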

Implementation

Files to change:

  1. wrangler.toml:

    • Add [[ratelimits]] for per-minute (built-in API)
    • Keep [[kv_namespaces]] for per-hour (KV)
    • Rename KV binding to RATE_LIMITER_KV for clarity
  2. index.js:

    • Check built-in API first (per-minute, fast)
    • Then check KV (per-hour, global)
    • Only 1 KV write instead of 2
  3. README.md:

    • Document hybrid approach
    • Explain rationale (bot vs human protection)

Performance Comparison

| Metric | Old (KV only) | New (Hybrid) |
|---|---|---|
| KV writes/request | 2 | 1 |
| Bot check latency | ~10-50ms | <1ms |
| Hourly limit scope | Global ✓ | Global ✓ |
| Writes/hour @ 10 req/min | 1,200 | 600 |
| Pro Plan cost | Unlimited | Unlimited |
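
The write counts in the comparison follow from simple arithmetic, reproduced here as a small sketch. The helper names are made up for illustration; the 1,000 writes/day figure is the free-plan limit cited in the Problem section.

```javascript
// KV writes per hour for a client sending `reqPerMin` requests per minute.
function kvWritesPerHour(reqPerMin, writesPerRequest) {
  return reqPerMin * 60 * writesPerRequest;
}

// Number of requests that fit into a daily KV write budget.
function requestsPerDailyBudget(dailyWrites, writesPerRequest) {
  return Math.floor(dailyWrites / writesPerRequest);
}

console.log(kvWritesPerHour(10, 2)); // 1200 (old: minute + hour keys)
console.log(kvWritesPerHour(10, 1)); // 600  (hybrid: hour key only)
console.log(requestsPerDailyBudget(1000, 2)); // 500
console.log(requestsPerDailyBudget(1000, 1)); // 1000
```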

Migration Steps

  1. Update wrangler.toml (add ratelimits, keep kv_namespaces)
  2. Update checkRateLimit() in index.js (hybrid implementation)
  3. Deploy to dev: wrangler deploy --env dev
  4. Test both limits work
  5. Deploy to production: wrangler deploy

Testing Plan

# Test per-minute limit (production: 10/min, dev: 60/min).
# Against dev, 15 requests stay under the 60/min limit, so send more than 60 within one minute.
for i in {1..65}; do
  curl -X POST https://osa-worker-dev.yahyaqaraeen.workers.dev/hed/ask \
    -H "Content-Type: application/json" \
    -d '{"question":"test"}' \
    -w "\n%{http_code}\n"
  sleep 0.2
done

# Test per-hour limit (would need 61+ requests in 1 hour)
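
For longer runs such as the per-hour test, counting status codes may be easier than reading curl output. The helper below is a sketch under assumptions: `tallyStatuses` is a hypothetical name, and the fetch function is injected so the helper can be tried against a stub before pointing it at the dev worker.

```javascript
// Send `count` POST requests sequentially and tally the HTTP status codes.
// `fetchFn` is injected so the helper can be exercised without a network.
async function tallyStatuses(fetchFn, url, count) {
  const tally = {};
  for (let i = 0; i < count; i += 1) {
    const res = await fetchFn(url, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ question: 'test' }),
    });
    tally[res.status] = (tally[res.status] || 0) + 1;
  }
  return tally;
}
```

On Node 18 or later, passing the global `fetch` and the dev URL from the curl loop above would return an object keyed by status code, which shows at a glance how many requests were accepted and how many were rejected.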

Labels: P1 (Priority 1: Critical, fix as soon as possible), cost-management (Cost tracking and optimization), operations (Operations, monitoring, and observability)