Rate Limiting Algorithms Explained with Code
TL;DR
Rate limiting prevents API abuse. Token bucket is most flexible, fixed window is simplest, sliding window is most accurate. Implement in-memory for single servers, Redis for distributed systems.
My API got hammered by a script making 10,000 requests per second. The database melted, response times hit 30 seconds, and legitimate users couldn't access the site.
I added rate limiting in 20 minutes and the problem disappeared. Malicious traffic got blocked, database load dropped 95%, and real users had fast response times again.
Rate limiting is essential for any API. Here's how the different algorithms work, when to use each, and complete implementations you can use.
What Rate Limiting Does
Rate limiting restricts how many requests a client can make in a time window:
Without rate limiting:
Client A: 10,000 requests/sec → Server accepts all → Database dies
With rate limiting (100 req/min):
Client A: 10,000 requests/sec → Server accepts 100, rejects 9,900 → Database happy
Benefits:
- Prevents abuse and DoS attacks
- Protects backend services from overload
- Ensures fair resource usage
- Reduces costs (fewer resources needed)
- Better experience for legitimate users
The Four Main Algorithms
1. Fixed Window Counter
The simplest algorithm: count requests in fixed time windows.
// In-memory fixed window rate limiter
class FixedWindowRateLimiter {
constructor(limit, windowMs) {
this.limit = limit; // Max requests per window
this.windowMs = windowMs; // Window size in milliseconds
this.counters = new Map(); // clientId -> { count, resetTime }
}
async isAllowed(clientId) {
const now = Date.now();
const client = this.counters.get(clientId);
// No existing window or window expired
if (!client || now >= client.resetTime) {
this.counters.set(clientId, {
count: 1,
resetTime: now + this.windowMs
});
return true;
}
// Within window
if (client.count < this.limit) {
client.count++;
return true;
}
// Limit exceeded
return false;
}
getRemainingTime(clientId) {
const client = this.counters.get(clientId);
if (!client) return 0;
return Math.max(0, client.resetTime - Date.now());
}
}
// Usage
const limiter = new FixedWindowRateLimiter(100, 60000); // 100 req/min
app.use(async (req, res, next) => {
const clientId = req.ip;
if (await limiter.isAllowed(clientId)) {
next();
} else {
const resetIn = Math.ceil(limiter.getRemainingTime(clientId) / 1000);
res.status(429).json({
error: 'Too many requests',
retryAfter: resetIn
});
}
});
How it works:
Time: 0s 60s 120s 180s
Window: [--1--] [--2--] [--3--] [--4--]
Requests: 100 100 100 100
Client makes 100 requests at 0s: ✅ Allowed (count: 100/100)
Client makes 1 more at 59s: ❌ Rejected (limit reached)
Client makes 100 at 60s: ✅ Allowed (new window, count: 100/100)
Pros:
- Simple to implement
- Memory efficient
- Easy to understand
Cons:
- Burst traffic at window boundaries
- Client can make 200 requests in 2 seconds (100 at end of window 1, 100 at start of window 2)
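The boundary burst is easy to reproduce when windows align to the wall clock (the style the Redis fixed-window implementation later in this post uses). A minimal, self-contained sketch with an injected clock, not the class above:

```javascript
// Aligned fixed-window check with an injected clock (illustrative sketch)
function makeAlignedFixedWindow(limit, windowMs) {
  let windowId = -1;
  let count = 0;
  return (now) => {
    const id = Math.floor(now / windowMs);
    if (id !== windowId) {
      // New wall-clock window: reset the counter
      windowId = id;
      count = 0;
    }
    if (count < limit) {
      count++;
      return true;
    }
    return false;
  };
}

const allow = makeAlignedFixedWindow(100, 60_000);
let accepted = 0;
// 100 requests squeezed in at t=59s, just before the boundary
for (let i = 0; i < 150; i++) if (allow(59_000)) accepted++;
// 100 more at t=60s, just after it
for (let i = 0; i < 150; i++) if (allow(60_000)) accepted++;
console.log(accepted); // 200 accepted within one second of wall time
```

Two full quotas land within a single second, even though each window individually stayed under its limit.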
2. Sliding Window Log
More accurate: track the timestamp of every request.
class SlidingWindowLogRateLimiter {
constructor(limit, windowMs) {
this.limit = limit;
this.windowMs = windowMs;
this.requests = new Map(); // clientId -> [timestamps]
}
async isAllowed(clientId) {
const now = Date.now();
const windowStart = now - this.windowMs;
// Get client's request log
let log = this.requests.get(clientId) || [];
// Remove requests outside current window
log = log.filter(timestamp => timestamp > windowStart);
// Check limit
if (log.length < this.limit) {
log.push(now);
this.requests.set(clientId, log);
return true;
}
return false;
}
getRetryAfter(clientId) {
const log = this.requests.get(clientId) || [];
if (log.length === 0) return 0;
const oldestRequest = log[0];
const windowStart = Date.now() - this.windowMs;
if (oldestRequest <= windowStart) return 0;
return Math.ceil((oldestRequest - windowStart) / 1000);
}
}
// Usage
const limiter = new SlidingWindowLogRateLimiter(100, 60000);
app.use(async (req, res, next) => {
const clientId = req.ip;
if (await limiter.isAllowed(clientId)) {
res.setHeader('X-RateLimit-Limit', 100);
res.setHeader('X-RateLimit-Remaining', Math.max(0, 100 - (limiter.requests.get(clientId)?.length || 0)));
next();
} else {
res.status(429).json({
error: 'Too many requests',
retryAfter: limiter.getRetryAfter(clientId)
});
}
});
How it works:
Time: 0s 30s 60s 90s 120s
Requests: [------100 requests------]
Window: [--60s window--]
At 90s, looking back 60s:
- Requests at 0-30s: Outside window (ignored)
- Requests at 30-90s: Inside window (counted)
- Accurate count at any time
Pros:
- Most accurate
- No burst traffic issues
- Smooth rate limiting
Cons:
- Memory intensive (stores every request timestamp)
- Expensive for high-traffic APIs
- Cleanup required to prevent memory leaks
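The window behavior is easiest to see in a compressed, self-contained sketch with an injected clock (limit of 3 per second): requests are re-admitted exactly as old timestamps age out.

```javascript
// Minimal sliding-window log with an injected clock (illustrative sketch)
function makeSlidingLog(limit, windowMs) {
  let log = []; // timestamps of accepted requests
  return (now) => {
    // Drop entries that have fallen outside the rolling window
    log = log.filter((t) => t > now - windowMs);
    if (log.length < limit) {
      log.push(now);
      return true;
    }
    return false;
  };
}

const allow = makeSlidingLog(3, 1000);
console.log(allow(0));    // true
console.log(allow(400));  // true
console.log(allow(800));  // true
console.log(allow(900));  // false — 3 requests in the last 1000ms
console.log(allow(1100)); // true — the t=0 entry aged out of the window
```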
3. Sliding Window Counter
Hybrid approach: combines fixed window simplicity with sliding window accuracy.
class SlidingWindowCounterRateLimiter {
constructor(limit, windowMs) {
this.limit = limit;
this.windowMs = windowMs;
this.windows = new Map(); // clientId -> { window, current, previous }
}
async isAllowed(clientId) {
const now = Date.now();
const currentWindow = Math.floor(now / this.windowMs);
let client = this.windows.get(clientId);
// Initialize or reset if needed
if (!client || client.window < currentWindow - 1) {
client = {
window: currentWindow,
current: 0,
previous: 0
};
this.windows.set(clientId, client);
}
// Move to next window if needed
if (client.window < currentWindow) {
client.previous = client.current;
client.current = 0;
client.window = currentWindow;
}
// Calculate weighted count
const percentageIntoWindow = (now % this.windowMs) / this.windowMs;
const weightedCount =
client.previous * (1 - percentageIntoWindow) +
client.current;
if (weightedCount < this.limit) {
client.current++;
return true;
}
return false;
}
}
// Usage
const limiter = new SlidingWindowCounterRateLimiter(100, 60000);
How it works:
Previous window: 80 requests (60s-120s)
Current window: 40 requests (120s-180s)
Current time: 150s (50% into current window)
Weighted count = 80 * (1 - 0.5) + 40 = 40 + 40 = 80
If limit is 100: ✅ Allow more requests
If limit is 70: ❌ Reject (estimated 80 requests in rolling 60s)
Pros:
- Memory efficient (only 2 counters per client)
- More accurate than fixed window
- Good performance
Cons:
- Approximate (not exact like sliding log)
- Slightly complex to understand
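The weighted-count arithmetic from the example, as a quick standalone check:

```javascript
// Weighted count = previous * (1 - fraction elapsed) + current
const previous = 80;  // requests in the previous window
const current = 40;   // requests so far in the current window
const fraction = 0.5; // 50% of the way into the current window

const weighted = previous * (1 - fraction) + current;
console.log(weighted);       // 80 — the rolling-window estimate
console.log(weighted < 100); // true  (allowed with a limit of 100)
console.log(weighted < 70);  // false (rejected with a limit of 70)
```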
4. Token Bucket
Most flexible: accumulate tokens over time and spend them on requests.
class TokenBucketRateLimiter {
constructor(capacity, refillRate, refillInterval = 1000) {
this.capacity = capacity; // Max tokens
this.refillRate = refillRate; // Tokens added per interval
this.refillInterval = refillInterval; // Interval in ms
this.buckets = new Map(); // clientId -> { tokens, lastRefill }
}
async isAllowed(clientId, cost = 1) {
const now = Date.now();
let bucket = this.buckets.get(clientId);
if (!bucket) {
bucket = {
tokens: this.capacity,
lastRefill: now
};
this.buckets.set(clientId, bucket);
}
// Refill tokens based on time passed
const timePassed = now - bucket.lastRefill;
const intervalsElapsed = Math.floor(timePassed / this.refillInterval);
if (intervalsElapsed > 0) {
const tokensToAdd = intervalsElapsed * this.refillRate;
bucket.tokens = Math.min(this.capacity, bucket.tokens + tokensToAdd);
// Advance by whole intervals only, so partial-interval time isn't lost
bucket.lastRefill += intervalsElapsed * this.refillInterval;
}
// Try to consume tokens
if (bucket.tokens >= cost) {
bucket.tokens -= cost;
return true;
}
return false;
}
getWaitTime(clientId, cost = 1) {
const bucket = this.buckets.get(clientId);
if (!bucket || bucket.tokens >= cost) return 0;
const tokensNeeded = cost - bucket.tokens;
const intervalsNeeded = Math.ceil(tokensNeeded / this.refillRate);
return (intervalsNeeded * this.refillInterval) / 1000;
}
}
// Usage
const limiter = new TokenBucketRateLimiter(
100, // Bucket capacity
10, // Refill 10 tokens
1000 // Every 1 second
);
app.use(async (req, res, next) => {
const clientId = req.ip;
const cost = req.path.includes('/expensive') ? 10 : 1;
if (await limiter.isAllowed(clientId, cost)) {
next();
} else {
const waitTime = limiter.getWaitTime(clientId, cost);
res.status(429).json({
error: 'Too many requests',
retryAfter: Math.ceil(waitTime)
});
}
});
How it works:
Bucket capacity: 100 tokens
Refill rate: 10 tokens/second
Time 0s: Tokens = 100
Request: Cost 1, tokens = 99 ✅
Request: Cost 1, tokens = 98 ✅
... (98 more requests)
Tokens: 0
Time 1s: Refill +10 tokens = 10
Requests: Can make 10 more requests
Pros:
- Handles bursts smoothly (accumulates unused tokens)
- Different costs per endpoint
- Most flexible algorithm
- Steady-state rate limiting
Cons:
- More complex to implement
- Harder to reason about
- Memory for token state
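The trace above can be compressed into a small sketch with an injected clock, so the burst and the refill are deterministic (this is a standalone toy, not the class above):

```javascript
// Minimal token bucket with an injected clock (illustrative sketch)
function makeTokenBucket(capacity, refillRate, refillIntervalMs) {
  let tokens = capacity;
  let lastRefill = 0;
  return (now, cost = 1) => {
    // Refill by whole elapsed intervals
    const intervals = Math.floor((now - lastRefill) / refillIntervalMs);
    if (intervals > 0) {
      tokens = Math.min(capacity, tokens + intervals * refillRate);
      lastRefill += intervals * refillIntervalMs;
    }
    if (tokens >= cost) {
      tokens -= cost;
      return true;
    }
    return false;
  };
}

const allow = makeTokenBucket(100, 10, 1000);
let burst = 0;
for (let i = 0; i < 120; i++) if (allow(0)) burst++;
console.log(burst);       // 100 — the full bucket absorbs the burst
console.log(allow(0));    // false — empty until the next refill
console.log(allow(1000)); // true — 10 tokens refilled after 1 second
```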
Redis-Based Distributed Rate Limiting
For multi-server deployments, use Redis:
const Redis = require('ioredis');
const redis = new Redis();
// Fixed window with Redis
class RedisFixedWindowRateLimiter {
constructor(redis, limit, windowSeconds) {
this.redis = redis;
this.limit = limit;
this.windowSeconds = windowSeconds;
}
async isAllowed(clientId) {
const key = `rate_limit:${clientId}`;
const now = Date.now();
const windowStart = Math.floor(now / (this.windowSeconds * 1000));
const windowKey = `${key}:${windowStart}`;
const count = await this.redis.incr(windowKey);
if (count === 1) {
// First request in this window, set expiration
await this.redis.expire(windowKey, this.windowSeconds * 2);
}
return count <= this.limit;
}
}
// Token bucket with Redis (Lua script for atomicity)
class RedisTokenBucketRateLimiter {
constructor(redis, capacity, refillRate, refillInterval) {
this.redis = redis;
this.capacity = capacity;
this.refillRate = refillRate;
this.refillInterval = refillInterval;
// Lua script for atomic token bucket operation
this.script = `
local key = KEYS[1]
local capacity = tonumber(ARGV[1])
local refill_rate = tonumber(ARGV[2])
local refill_interval = tonumber(ARGV[3])
local cost = tonumber(ARGV[4])
local now = tonumber(ARGV[5])
local bucket = redis.call('HMGET', key, 'tokens', 'last_refill')
local tokens = tonumber(bucket[1]) or capacity
local last_refill = tonumber(bucket[2]) or now
-- Refill tokens
local time_passed = now - last_refill
local intervals_elapsed = math.floor(time_passed / refill_interval)
if intervals_elapsed > 0 then
tokens = math.min(capacity, tokens + (intervals_elapsed * refill_rate))
-- advance by whole intervals so partial-interval time is not lost
last_refill = last_refill + (intervals_elapsed * refill_interval)
end
-- Try to consume tokens
if tokens >= cost then
tokens = tokens - cost
redis.call('HMSET', key, 'tokens', tokens, 'last_refill', last_refill)
redis.call('EXPIRE', key, 3600)
return 1
else
return 0
end
`;
}
async isAllowed(clientId, cost = 1) {
const key = `token_bucket:${clientId}`;
const now = Date.now();
const result = await this.redis.eval(
this.script,
1,
key,
this.capacity,
this.refillRate,
this.refillInterval,
cost,
now
);
return result === 1;
}
}
// Usage
const limiter = new RedisTokenBucketRateLimiter(redis, 100, 10, 1000);
app.use(async (req, res, next) => {
if (await limiter.isAllowed(req.ip)) {
next();
} else {
res.status(429).send('Too many requests');
}
});
Express Middleware Implementation
A more complete, production-oriented middleware:
const Redis = require('ioredis');
const redis = new Redis();
function createRateLimiter(options = {}) {
const {
windowMs = 60000, // 1 minute
max = 100, // 100 requests per window
message = 'Too many requests',
statusCode = 429,
headers = true, // Include rate limit headers
skipSuccessfulRequests = false, // accepted here but not implemented below
skipFailedRequests = false, // accepted here but not implemented below
keyGenerator = (req) => req.ip,
skip = (req) => false,
handler = (req, res) => {
res.status(statusCode).json({ error: message });
}
} = options;
return async (req, res, next) => {
if (skip(req)) {
return next();
}
const key = keyGenerator(req);
const redisKey = `rate_limit:${key}`;
const now = Date.now();
try {
// Sliding window log with Redis sorted sets
const multi = redis.multi();
multi.zadd(redisKey, now, `${now}-${Math.random()}`);
multi.zremrangebyscore(redisKey, 0, now - windowMs);
multi.zcard(redisKey);
multi.expire(redisKey, Math.ceil(windowMs / 1000) * 2);
const results = await multi.exec();
const count = results[2][1];
if (headers) {
res.setHeader('X-RateLimit-Limit', max);
res.setHeader('X-RateLimit-Remaining', Math.max(0, max - count));
res.setHeader('X-RateLimit-Reset', new Date(now + windowMs).toISOString());
}
if (count > max) {
res.setHeader('Retry-After', Math.ceil(windowMs / 1000));
return handler(req, res);
}
next();
} catch (error) {
console.error('Rate limiter error:', error);
// Fail open - allow request if Redis is down
next();
}
};
}
// Usage examples
app.use('/api', createRateLimiter({
windowMs: 60000,
max: 100
}));
app.use('/api/expensive', createRateLimiter({
windowMs: 60000,
max: 10
}));
app.use('/api/auth/login', createRateLimiter({
windowMs: 900000, // 15 minutes
max: 5, // 5 attempts
skipSuccessfulRequests: true
}));
Different Limits for Different Endpoints
const rateLimiters = {
default: createRateLimiter({ max: 100, windowMs: 60000 }),
strict: createRateLimiter({ max: 10, windowMs: 60000 }),
auth: createRateLimiter({ max: 5, windowMs: 900000 }),
public: createRateLimiter({ max: 1000, windowMs: 60000 })
};
// Apply different limits
app.use('/api', rateLimiters.default);
app.post('/api/auth/login', rateLimiters.auth);
app.use('/api/admin', rateLimiters.strict);
app.use('/api/public', rateLimiters.public);
Rate Limiting by User vs IP
function getRateLimiterKey(req) {
// Authenticated users: rate limit by user ID
if (req.user) {
return `user:${req.user.id}`;
}
// Anonymous: rate limit by IP
// Handle proxies and load balancers (only trust these headers when
// the app sits behind a proxy you control — they're client-supplied)
const ip = req.headers['x-forwarded-for']?.split(',')[0]?.trim()
|| req.headers['x-real-ip']
|| req.socket.remoteAddress;
return `ip:${ip}`;
}
app.use(createRateLimiter({
keyGenerator: getRateLimiterKey,
// Note: the createRateLimiter sketch above expects a numeric max. To vary
// the limit per request, resolve it inside the middleware first, e.g.
// const limit = typeof max === 'function' ? max(req) : max;
max: (req) => req.user ? 1000 : 100 // Higher limit for authenticated users
}));
Tiered Rate Limiting
Different limits for different user tiers:
const TIER_LIMITS = {
free: { requests: 100, window: 3600 }, // 100/hour
basic: { requests: 1000, window: 3600 }, // 1000/hour
premium: { requests: 10000, window: 3600 }, // 10k/hour
enterprise: { requests: 100000, window: 3600 } // 100k/hour
};
function createTieredRateLimiter() {
return async (req, res, next) => {
const tier = req.user?.tier || 'free';
const limits = TIER_LIMITS[tier];
const key = `rate_limit:${tier}:${req.user?.id || req.ip}`;
const now = Date.now();
const windowStart = now - (limits.window * 1000);
const count = await redis.zcount(key, windowStart, now);
if (count >= limits.requests) {
return res.status(429).json({
error: 'Rate limit exceeded',
tier: tier,
limit: limits.requests,
window: limits.window,
upgradeUrl: '/pricing'
});
}
await redis.zadd(key, now, `${now}-${Math.random()}`);
await redis.expire(key, limits.window * 2);
res.setHeader('X-RateLimit-Tier', tier);
res.setHeader('X-RateLimit-Limit', limits.requests);
res.setHeader('X-RateLimit-Remaining', limits.requests - count - 1);
next();
};
}
app.use('/api', createTieredRateLimiter());
Cost-Based Rate Limiting
Different endpoints cost different amounts:
const ENDPOINT_COSTS = {
'GET /api/users': 1,
'POST /api/search': 5,
'POST /api/ai/generate': 100,
'GET /api/reports/export': 50
};
class CostBasedRateLimiter {
constructor(redis, budget, windowMs) {
this.redis = redis;
this.budget = budget;
this.windowMs = windowMs;
}
async isAllowed(clientId, cost) {
const key = `cost_limit:${clientId}`;
// Get total cost in current window
const spent = await this.redis.get(`${key}:spent`) || 0;
if (parseInt(spent) + cost > this.budget) {
return false;
}
// Record this cost; set the TTL only when the key is first created,
// otherwise every accepted request would push the window further out
const total = await this.redis.incrby(`${key}:spent`, cost);
if (total === cost) {
await this.redis.expire(`${key}:spent`, Math.ceil(this.windowMs / 1000));
}
return true;
}
}
const costLimiter = new CostBasedRateLimiter(redis, 1000, 60000);
app.use(async (req, res, next) => {
const endpoint = `${req.method} ${req.path}`;
const cost = ENDPOINT_COSTS[endpoint] || 1;
const clientId = req.user?.id || req.ip;
if (await costLimiter.isAllowed(clientId, cost)) {
res.setHeader('X-RateLimit-Cost', cost);
next();
} else {
res.status(429).json({
error: 'Budget exceeded',
cost: cost,
budget: 1000
});
}
});
Testing Rate Limiters
const assert = require('assert');
async function testFixedWindow() {
const limiter = new FixedWindowRateLimiter(10, 1000);
// Should allow 10 requests
for (let i = 0; i < 10; i++) {
assert(await limiter.isAllowed('client1'), `Request ${i+1} should be allowed`);
}
// 11th should be rejected
assert(!await limiter.isAllowed('client1'), '11th request should be rejected');
// Wait for window to reset
await new Promise(resolve => setTimeout(resolve, 1100));
// Should allow again
assert(await limiter.isAllowed('client1'), 'Should allow after window reset');
console.log('✅ Fixed window tests passed');
}
async function testTokenBucket() {
const limiter = new TokenBucketRateLimiter(10, 5, 1000);
// Consume all tokens
for (let i = 0; i < 10; i++) {
assert(await limiter.isAllowed('client1'), `Request ${i+1} should be allowed`);
}
// Should be rejected
assert(!await limiter.isAllowed('client1'), 'Should be rejected when tokens exhausted');
// Wait for refill
await new Promise(resolve => setTimeout(resolve, 1100));
// Should have 5 tokens now
for (let i = 0; i < 5; i++) {
assert(await limiter.isAllowed('client1'), `Request ${i+1} after refill should be allowed`);
}
assert(!await limiter.isAllowed('client1'), 'Should be rejected after using refilled tokens');
console.log('✅ Token bucket tests passed');
}
testFixedWindow();
testTokenBucket();
Monitoring and Alerts
Track rate limit metrics:
const prometheus = require('prom-client');
const rateLimitCounter = new prometheus.Counter({
name: 'rate_limit_requests_total',
help: 'Total rate limit checks',
labelNames: ['client_id', 'allowed']
});
const rateLimitRejections = new prometheus.Counter({
name: 'rate_limit_rejections_total',
help: 'Total rate limit rejections',
labelNames: ['endpoint']
});
async function rateLimitWithMetrics(clientId, limiter, endpoint) {
const allowed = await limiter.isAllowed(clientId);
rateLimitCounter.inc({
// Beware: per-client labels can explode Prometheus cardinality;
// bucket or drop this label for high-traffic APIs
client_id: clientId,
allowed: allowed.toString()
});
if (!allowed) {
rateLimitRejections.inc({
endpoint: endpoint
});
}
return allowed;
}
// Alert if rejection rate > 10% — in practice, define this as a Prometheus
// alerting rule over rate_limit_rejections_total rather than polling in-process
Algorithm Comparison
| Algorithm | Memory | Accuracy | Burst | Distributed | Complexity |
|---|---|---|---|---|---|
| Fixed Window | Low | Poor | High | Easy | Simple |
| Sliding Log | High | Perfect | Low | Hard | Medium |
| Sliding Counter | Low | Good | Medium | Easy | Medium |
| Token Bucket | Low | Good | Smooth | Medium | Complex |
When to use:
- Fixed Window: Simple APIs, low traffic, memory constrained
- Sliding Log: Need perfect accuracy, can afford memory cost
- Sliding Counter: Good balance of accuracy and efficiency
- Token Bucket: Need burst handling, different endpoint costs
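That guidance can be encoded as a toy decision helper (the function and its option names are illustrative, not from any library):

```javascript
// Toy chooser mirroring the comparison table (illustrative only)
function pickAlgorithm({ needExactCounts = false, needBursts = false,
                         perRequestCosts = false, lowMemory = false } = {}) {
  if (needExactCounts) return 'sliding-log';          // perfect accuracy, memory cost
  if (needBursts || perRequestCosts) return 'token-bucket';
  if (lowMemory) return 'fixed-window';               // simplest, boundary bursts
  return 'sliding-counter';                           // good default balance
}

console.log(pickAlgorithm({ needBursts: true })); // token-bucket
console.log(pickAlgorithm());                     // sliding-counter
```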
Production Checklist
Before deploying rate limiting:
// 1. Handle rate limit headers
res.setHeader('X-RateLimit-Limit', limit);
res.setHeader('X-RateLimit-Remaining', remaining);
res.setHeader('X-RateLimit-Reset', resetTime);
res.setHeader('Retry-After', retryAfter);
// 2. Return proper status codes
res.status(429).json({
error: 'Too many requests',
retryAfter: 60
});
// 3. Whitelist internal services
if (req.headers['x-internal-service'] === process.env.INTERNAL_SECRET) {
return next();
}
// 4. Fail open if Redis is down
try {
const allowed = await rateLimiter.isAllowed(clientId);
} catch (error) {
console.error('Rate limiter error:', error);
return next(); // Allow request if rate limiter fails
}
// 5. Log rejections for analysis
if (!allowed) {
logger.warn('Rate limit exceeded', {
clientId,
endpoint: req.path,
userAgent: req.headers['user-agent']
});
}
// 6. Monitor metrics
rateLimitMetrics.rejections.inc();
rateLimitMetrics.allowed.inc();
Real-World Configuration
My production setup:
// Public endpoints: generous limits
app.use('/api/public', createRateLimiter({
max: 1000,
windowMs: 60000
}));
// Authenticated endpoints: per-user limits
app.use('/api', authenticate, createRateLimiter({
keyGenerator: (req) => req.user.id,
max: 5000,
windowMs: 60000
}));
// Auth endpoints: strict limits
app.post('/api/auth/login', createRateLimiter({
max: 5,
windowMs: 900000, // 15 minutes
skipSuccessfulRequests: true // Only count failed attempts
}));
app.post('/api/auth/register', createRateLimiter({
max: 3,
windowMs: 3600000 // 1 hour
}));
// Expensive operations: very strict
app.post('/api/reports/generate', createRateLimiter({
max: 10,
windowMs: 3600000 // 1 hour
}));
// Webhooks: rate limit per source
app.post('/webhooks/:source', createRateLimiter({
keyGenerator: (req) => `webhook:${req.params.source}`,
max: 1000,
windowMs: 60000
}));
The Bottom Line
Rate limiting is essential for production APIs:
Start with fixed window for simplicity, then upgrade to token bucket or sliding counter if you need better accuracy or burst handling.
Use Redis for distributed systems. In-memory works fine for single servers.
Different limits for different contexts: public vs authenticated, free vs paid tiers, cheap vs expensive endpoints.
Monitor rejections: high rejection rates mean limits are too strict or you're under attack.
Fail open: if rate limiter breaks, allow requests through. Better to risk abuse than break your entire API.
Rate limiting saved my API from melting down. Implement it before you need it - preventing abuse is easier than recovering from it.