name: ratelimit description: Rate limiting algorithms, quota management, throttling, API protection.
Activate When
/godmode:ratelimit, "rate limit", "throttle"- "API quota", "token bucket", "sliding window"
- "DDoS protection", "abuse prevention", "429"
- "rate limiting", "rate limit middleware", "throttling", "request throttle"
Workflow
1. Assessment
grep -r "rate.limit\|ratelimit\|throttle" \
package.json requirements.txt go.mod 2>/dev/null
grep -r "rateLim\|throttle\|RateLimit" \
--include="*.ts" --include="*.py" -l 2>/dev/null
Current: None | Basic (nginx) | App-level | Multi-layer
API Surface: public <N>, authenticated <N>, internal <N>
Risk: unauthenticated abuse, cost-sensitive endpoints
2. Algorithm Selection
- Token Bucket: burst tolerance, most common. Start here. Bucket holds N tokens, refilled at rate.
- Leaky Bucket: smooth output rate, traffic shaping.
- Fixed Window Counter: simple but 2x burst at edges.
- Sliding Window Log: exact, O(n) memory. Billing.
- Sliding Window Counter (RECOMMENDED): weighted average of current+previous window. O(1) memory. Used by Cloudflare, Stripe.
IF unsure: use sliding window counter. IF burst tolerance needed: use token bucket.
3. Tier Design
Anonymous: 20/min, burst 5, 1K/day
Free: 60/min, burst 15, 10K/day
Basic/$29: 300/min, burst 50, 100K/day
Pro/$99: 1K/min, burst 200, 1M/day
Enterprise: 5K/min, burst 1K, unlimited
Internal: no limit (bypass)
Endpoint overrides (stricter):
- POST /auth/login: 5/min (brute force)
- POST /auth/register: 3/min
- POST /upload: 10/hour (storage cost)
- POST /ai/generate: 20/min (compute cost)
- GET /export: 5/hour (exfiltration)
Resolution: endpoint > user tier > global IP.
4. Response Headers
Standard: RateLimit-Limit, RateLimit-Remaining,
RateLimit-Reset. On 429: add Retry-After.
Set headers on EVERY response (success and 429).
5. Distributed Rate Limiting (Redis)
Lua script for atomic sliding window counter.
Without shared state, N instances * limit = N*limit.
Load script via SCRIPT LOAD, call via EVALSHA.
6. Middleware
Resolve client key (API key or IP), resolve tier, call Redis Lua, set headers, return 429 when denied. Skip health checks and internal paths.
7. Graceful Degradation
RULE: Rate limiter failure NEVER causes app failure. Fail OPEN when Redis is down. Log warning. Optional local in-memory fallback (less accurate).
8. Quota Management
Rate limit = short window (100/min).
Quota = long window (10K/day, 1M/month).
INCR quota:{key}:{date} with 2-day TTL.
Warn at 75%, 90%, 100%. Optional overage billing.
9. Monitoring
Metrics: requests_total, rejected_total (>100/min), rejected_ratio (>10%), latency P95 (>5ms), redis_errors, failopen_total (>0), quota_usage (>90%).
10. Validation
All public endpoints protected, auth endpoints strict (<=5/min), headers on all responses, 429 has Retry-After, atomic ops, fail-open, tier limits set.
Hard Rules
- NEVER non-atomic check-and-increment (race bypass).
- NEVER fail closed when Redis unavailable.
- ALWAYS return rate limit headers on every response.
- NEVER same limits for authed and unauthed traffic.
- ALWAYS sliding window or token bucket for users.
- NEVER app-memory state for multi-instance deploys.
- ALWAYS exempt internal service traffic.
- NEVER log rate-limited request bodies.
TSV Logging
Append .godmode/ratelimit.tsv:
timestamp algorithm storage tiers endpoint_overrides quota status
Keep/Discard
KEEP if: limit enforced atomically AND headers present
AND fail-open on Redis down.
DISCARD if: race condition allows bypass
OR 429 missing Retry-After OR fail-closed.
Stop Conditions
STOP when FIRST of:
- All public endpoints protected
- Tiers configured
- Fail-open verified
Autonomous Operation
On failure: git reset --hard HEAD~1. Never pause.
<!-- tier-3 -->Error Recovery
| Failure | Action |
|---|---|
| Redis unavailable | Fail open, in-memory fallback |
| Clients bypassing | Check atomic ops, add IP limit |
| 429 no Retry-After | Add to response handler |
| Limits too strict | Analyze traffic, increase burst |
| Lua script errors | Check Redis version, reload |