Rate Limiting Algorithms Explained with Code

TL;DR

Rate limiting prevents API abuse. Token bucket is most flexible, fixed window is simplest, sliding window is most accurate. Implement in-memory for single servers, Redis for distributed systems.

My API got hammered by a script making 10,000 requests per second. The database melted, response times hit 30 seconds, and legitimate users couldn't access the site.

I added rate limiting in 20 minutes and the problem disappeared. Malicious traffic got blocked, database load dropped 95%, and real users had fast response times again.

Rate limiting is essential for any API. Here's how the different algorithms work, when to use each, and complete implementations you can use.

What Rate Limiting Does

Rate limiting restricts how many requests a client can make in a time window:

Without rate limiting:
Client A: 10,000 requests/sec → Server accepts all → Database dies

With rate limiting (100 req/min):
Client A: 10,000 requests/sec → Server accepts 100, rejects 9,900 → Database happy

Benefits:

  • Prevents abuse and DoS attacks
  • Protects backend services from overload
  • Ensures fair resource usage
  • Reduces costs (fewer resources needed)
  • Better experience for legitimate users

The Four Main Algorithms

1. Fixed Window Counter

The simplest algorithm: count requests in fixed time windows.

// In-memory fixed window rate limiter
class FixedWindowRateLimiter {
    constructor(limit, windowMs) {
        this.limit = limit;          // Max requests per window
        this.windowMs = windowMs;    // Window size in milliseconds
        this.counters = new Map();   // clientId -> { count, resetTime }
    }

    async isAllowed(clientId) {
        const now = Date.now();
        const client = this.counters.get(clientId);

        // No existing window or window expired
        if (!client || now >= client.resetTime) {
            this.counters.set(clientId, {
                count: 1,
                resetTime: now + this.windowMs
            });
            return true;
        }

        // Within window
        if (client.count < this.limit) {
            client.count++;
            return true;
        }

        // Limit exceeded
        return false;
    }

    getRemainingTime(clientId) {
        const client = this.counters.get(clientId);
        if (!client) return 0;
        return Math.max(0, client.resetTime - Date.now());
    }
}

// Usage
const limiter = new FixedWindowRateLimiter(100, 60000); // 100 req/min

app.use(async (req, res, next) => {
    const clientId = req.ip;

    if (await limiter.isAllowed(clientId)) {
        next();
    } else {
        const resetIn = Math.ceil(limiter.getRemainingTime(clientId) / 1000);
        res.status(429).json({
            error: 'Too many requests',
            retryAfter: resetIn
        });
    }
});

How it works:

Time:     0s      60s     120s    180s
Window:   [--1--] [--2--] [--3--] [--4--]
Requests: 100     100     100     100

Client makes 100 requests at 0s:   ✅ Allowed (count: 100/100)
Client makes 1 more at 59s:        ❌ Rejected (limit reached)
Client makes 100 at 60s:           ✅ Allowed (new window, count: 100/100)

Pros:

  • Simple to implement
  • Memory efficient
  • Easy to understand

Cons:

  • Burst traffic at window boundaries
  • Client can make 200 requests in 2 seconds (100 at end of window 1, 100 at start of window 2)
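The boundary burst is easy to reproduce. Here's a stripped-down, single-client version of the fixed-window check with an injectable clock (a sketch, not the class above verbatim), so both windows can be exercised without waiting a minute:

```javascript
// Minimal fixed-window check; `now` is passed in so time can be simulated
function makeFixedWindow(limit, windowMs) {
    let count = 0;
    let resetTime = 0;
    return (now) => {
        if (now >= resetTime) {           // window expired: start a new one
            count = 0;
            resetTime = now + windowMs;
        }
        return ++count <= limit;
    };
}

const allow = makeFixedWindow(100, 60000); // 100 req/min
let accepted = 0;

if (allow(0)) accepted++;                                   // opens window [0s, 60s)
for (let i = 0; i < 99; i++) if (allow(59000)) accepted++;  // 99 more at 59s
for (let i = 0; i < 100; i++) if (allow(60000)) accepted++; // 100 more at 60s (new window)

console.log(accepted); // 200 accepted within one second of simulated time
```

All 200 requests land inside a one-second span even though the nominal limit is 100 per minute; this is exactly the weakness the sliding-window variants address.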

2. Sliding Window Log

More accurate: track timestamp of each request.

class SlidingWindowLogRateLimiter {
    constructor(limit, windowMs) {
        this.limit = limit;
        this.windowMs = windowMs;
        this.requests = new Map(); // clientId -> [timestamps]
    }

    async isAllowed(clientId) {
        const now = Date.now();
        const windowStart = now - this.windowMs;

        // Get client's request log
        let log = this.requests.get(clientId) || [];

        // Remove requests outside current window
        log = log.filter(timestamp => timestamp > windowStart);

        // Check limit
        if (log.length < this.limit) {
            log.push(now);
            this.requests.set(clientId, log);
            return true;
        }

        return false;
    }

    getRetryAfter(clientId) {
        const log = this.requests.get(clientId) || [];
        if (log.length === 0) return 0;

        const oldestRequest = log[0];
        const windowStart = Date.now() - this.windowMs;

        if (oldestRequest <= windowStart) return 0;
        return Math.ceil((oldestRequest - windowStart) / 1000);
    }
}

// Usage
const limiter = new SlidingWindowLogRateLimiter(100, 60000);

app.use(async (req, res, next) => {
    const clientId = req.ip;

    if (await limiter.isAllowed(clientId)) {
        const used = (limiter.requests.get(clientId) || []).length;
        res.setHeader('X-RateLimit-Limit', 100);
        res.setHeader('X-RateLimit-Remaining', Math.max(0, 100 - used));
        next();
    } else {
        res.status(429).json({
            error: 'Too many requests',
            retryAfter: limiter.getRetryAfter(clientId)
        });
    }
});

How it works:

Time:        0s    30s    60s    90s    120s
Requests:    [------100 requests------]
Window:            [--60s window--]

At 90s, looking back 60s:
- Requests at 0-30s:  Outside window (ignored)
- Requests at 30-90s: Inside window (counted)
- Accurate count at any time

Pros:

  • Most accurate
  • No burst traffic issues
  • Smooth rate limiting

Cons:

  • Memory intensive (stores every request timestamp)
  • Expensive for high-traffic APIs
  • Cleanup required to prevent memory leaks
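The memory-leak caveat can be handled with a periodic sweep. A minimal sketch, assuming access to the limiter's internal `requests` Map from the class above:

```javascript
// Drop clients with no in-window requests, and prune stale timestamps
// from the rest. `requests` is the Map of clientId -> [timestamps].
function sweepStaleClients(requests, windowMs, now = Date.now()) {
    const windowStart = now - windowMs;
    for (const [clientId, log] of requests) {
        const live = log.filter(ts => ts > windowStart);
        if (live.length === 0) {
            requests.delete(clientId);     // inactive client: free the entry
        } else {
            requests.set(clientId, live);  // keep only in-window timestamps
        }
    }
}

// e.g. sweep once a minute:
// setInterval(() => sweepStaleClients(limiter.requests, 60000), 60000);
```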

3. Sliding Window Counter

Hybrid approach: combines fixed window simplicity with sliding window accuracy.

class SlidingWindowCounterRateLimiter {
    constructor(limit, windowMs) {
        this.limit = limit;
        this.windowMs = windowMs;
        this.windows = new Map(); // clientId -> { current, previous, resetTime }
    }

    async isAllowed(clientId) {
        const now = Date.now();
        const currentWindow = Math.floor(now / this.windowMs);

        let client = this.windows.get(clientId);

        // Initialize or reset if needed
        if (!client || client.window < currentWindow - 1) {
            client = {
                window: currentWindow,
                current: 0,
                previous: 0
            };
            this.windows.set(clientId, client);
        }

        // Move to next window if needed
        if (client.window < currentWindow) {
            client.previous = client.current;
            client.current = 0;
            client.window = currentWindow;
        }

        // Calculate weighted count
        const percentageIntoWindow = (now % this.windowMs) / this.windowMs;
        const weightedCount =
            client.previous * (1 - percentageIntoWindow) +
            client.current;

        if (weightedCount < this.limit) {
            client.current++;
            return true;
        }

        return false;
    }
}

// Usage
const limiter = new SlidingWindowCounterRateLimiter(100, 60000);

How it works:

Previous window: 80 requests (60s-120s)
Current window:  40 requests (120s-180s)
Current time:    150s (50% into current window)

Weighted count = 80 * (1 - 0.5) + 40 = 40 + 40 = 80

If limit is 100: ✅ Allow more requests
If limit is 70:  ❌ Reject (estimated 80 requests in rolling 60s)
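The estimate above as a standalone helper, with the walkthrough's numbers plugged in:

```javascript
// Weighted rolling-window estimate: the previous window's count decays
// linearly as the current window fills in.
function weightedCount(previous, current, fractionIntoWindow) {
    return previous * (1 - fractionIntoWindow) + current;
}

console.log(weightedCount(80, 40, 0.5));  // 80, as in the example above
console.log(weightedCount(80, 40, 0.75)); // 60: the old window counts for less
```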

Pros:

  • Memory efficient (only 2 counters per client)
  • More accurate than fixed window
  • Good performance

Cons:

  • Approximate (not exact like sliding log)
  • Slightly complex to understand

4. Token Bucket

Most flexible: accumulate tokens over time, spend on requests.

class TokenBucketRateLimiter {
    constructor(capacity, refillRate, refillInterval = 1000) {
        this.capacity = capacity;           // Max tokens
        this.refillRate = refillRate;       // Tokens added per interval
        this.refillInterval = refillInterval; // Interval in ms
        this.buckets = new Map();           // clientId -> { tokens, lastRefill }
    }

    async isAllowed(clientId, cost = 1) {
        const now = Date.now();
        let bucket = this.buckets.get(clientId);

        if (!bucket) {
            bucket = {
                tokens: this.capacity,
                lastRefill: now
            };
            this.buckets.set(clientId, bucket);
        }

        // Refill tokens based on time passed
        const timePassed = now - bucket.lastRefill;
        const intervalsElapsed = Math.floor(timePassed / this.refillInterval);

        if (intervalsElapsed > 0) {
            const tokensToAdd = intervalsElapsed * this.refillRate;
            bucket.tokens = Math.min(this.capacity, bucket.tokens + tokensToAdd);
            // Advance by whole intervals so partial-interval time isn't lost
            bucket.lastRefill += intervalsElapsed * this.refillInterval;
        }

        // Try to consume tokens
        if (bucket.tokens >= cost) {
            bucket.tokens -= cost;
            return true;
        }

        return false;
    }

    getWaitTime(clientId, cost = 1) {
        const bucket = this.buckets.get(clientId);
        if (!bucket || bucket.tokens >= cost) return 0;

        const tokensNeeded = cost - bucket.tokens;
        const intervalsNeeded = Math.ceil(tokensNeeded / this.refillRate);
        return (intervalsNeeded * this.refillInterval) / 1000;
    }
}

// Usage
const limiter = new TokenBucketRateLimiter(
    100,    // Bucket capacity
    10,     // Refill 10 tokens
    1000    // Every 1 second
);

app.use(async (req, res, next) => {
    const clientId = req.ip;
    const cost = req.path.includes('/expensive') ? 10 : 1;

    if (await limiter.isAllowed(clientId, cost)) {
        next();
    } else {
        const waitTime = limiter.getWaitTime(clientId, cost);
        res.status(429).json({
            error: 'Too many requests',
            retryAfter: Math.ceil(waitTime)
        });
    }
});

How it works:

Bucket capacity: 100 tokens
Refill rate: 10 tokens/second

Time 0s:   Tokens = 100
Request:   Cost 1, tokens = 99 ✅
Request:   Cost 1, tokens = 98 ✅
... (98 more requests)
Tokens:    0

Time 1s:   Refill +10 tokens = 10
Requests:  Can make 10 more requests
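The same accounting with an injectable clock (a simplified single-client sketch, not the class above), so the refill can be verified without real waiting:

```javascript
// Single-bucket token accounting; `now` is passed in to simulate time
function makeBucket(capacity, refillRate, refillInterval) {
    let tokens = capacity;
    let lastRefill = 0;
    return (now, cost = 1) => {
        const intervals = Math.floor((now - lastRefill) / refillInterval);
        if (intervals > 0) {
            tokens = Math.min(capacity, tokens + intervals * refillRate);
            lastRefill += intervals * refillInterval; // keep the leftover fraction
        }
        if (tokens >= cost) {
            tokens -= cost;
            return true;
        }
        return false;
    };
}

const take = makeBucket(100, 10, 1000); // 100 capacity, 10 tokens/sec
for (let i = 0; i < 100; i++) take(0);  // drain the bucket at t=0

console.log(take(0));    // false: bucket empty
console.log(take(1000)); // true: 10 tokens refilled one second later
```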

Pros:

  • Handles bursts smoothly (accumulates unused tokens)
  • Different costs per endpoint
  • Most flexible algorithm
  • Steady-state rate limiting

Cons:

  • More complex to implement
  • Harder to reason about
  • Memory for token state

Redis-Based Distributed Rate Limiting

For multi-server deployments, use Redis:

const Redis = require('ioredis');
const redis = new Redis();

// Fixed window with Redis
class RedisFixedWindowRateLimiter {
    constructor(redis, limit, windowSeconds) {
        this.redis = redis;
        this.limit = limit;
        this.windowSeconds = windowSeconds;
    }

    async isAllowed(clientId) {
        const key = `rate_limit:${clientId}`;
        const now = Date.now();
        const windowIndex = Math.floor(now / (this.windowSeconds * 1000));
        const windowKey = `${key}:${windowIndex}`;

        const count = await this.redis.incr(windowKey);

        if (count === 1) {
            // First request in this window, set expiration
            await this.redis.expire(windowKey, this.windowSeconds * 2);
        }

        return count <= this.limit;
    }
}

// Token bucket with Redis (Lua script for atomicity)
class RedisTokenBucketRateLimiter {
    constructor(redis, capacity, refillRate, refillInterval) {
        this.redis = redis;
        this.capacity = capacity;
        this.refillRate = refillRate;
        this.refillInterval = refillInterval;

        // Lua script for atomic token bucket operation
        this.script = `
            local key = KEYS[1]
            local capacity = tonumber(ARGV[1])
            local refill_rate = tonumber(ARGV[2])
            local refill_interval = tonumber(ARGV[3])
            local cost = tonumber(ARGV[4])
            local now = tonumber(ARGV[5])

            local bucket = redis.call('HMGET', key, 'tokens', 'last_refill')
            local tokens = tonumber(bucket[1]) or capacity
            local last_refill = tonumber(bucket[2]) or now

            -- Refill tokens
            local time_passed = now - last_refill
            local intervals_elapsed = math.floor(time_passed / refill_interval)

            if intervals_elapsed > 0 then
                tokens = math.min(capacity, tokens + (intervals_elapsed * refill_rate))
                -- advance by whole intervals so partial-interval time isn't lost
                last_refill = last_refill + (intervals_elapsed * refill_interval)
            end

            -- Try to consume tokens
            if tokens >= cost then
                tokens = tokens - cost
                redis.call('HMSET', key, 'tokens', tokens, 'last_refill', last_refill)
                redis.call('EXPIRE', key, 3600)
                return 1
            else
                return 0
            end
        `;
    }

    async isAllowed(clientId, cost = 1) {
        const key = `token_bucket:${clientId}`;
        const now = Date.now();

        const result = await this.redis.eval(
            this.script,
            1,
            key,
            this.capacity,
            this.refillRate,
            this.refillInterval,
            cost,
            now
        );

        return result === 1;
    }
}

// Usage
const limiter = new RedisTokenBucketRateLimiter(redis, 100, 10, 1000);

app.use(async (req, res, next) => {
    if (await limiter.isAllowed(req.ip)) {
        next();
    } else {
        res.status(429).send('Too many requests');
    }
});

Express Middleware Implementation

Complete production-ready middleware:

const Redis = require('ioredis');
const redis = new Redis();

function createRateLimiter(options = {}) {
    const {
        windowMs = 60000,        // 1 minute
        max = 100,               // Max requests per window (number or (req) => number)
        message = 'Too many requests',
        statusCode = 429,
        headers = true,          // Include rate limit headers
        skipSuccessfulRequests = false,  // Refund requests that finish with a 2xx/3xx
        skipFailedRequests = false,      // Refund requests that finish with a 4xx/5xx
        keyGenerator = (req) => req.ip,
        skip = (req) => false,
        handler = (req, res) => {
            res.status(statusCode).json({ error: message });
        }
    } = options;

    return async (req, res, next) => {
        if (skip(req)) {
            return next();
        }

        const key = keyGenerator(req);
        const redisKey = `rate_limit:${key}`;
        const now = Date.now();
        const limit = typeof max === 'function' ? max(req) : max;
        const member = `${now}-${Math.random()}`;

        try {
            // Sliding window log with a Redis sorted set
            const multi = redis.multi();
            multi.zadd(redisKey, now, member);
            multi.zremrangebyscore(redisKey, 0, now - windowMs);
            multi.zcard(redisKey);
            multi.expire(redisKey, Math.ceil(windowMs / 1000) * 2);

            const results = await multi.exec();
            const count = results[2][1];

            if (headers) {
                res.setHeader('X-RateLimit-Limit', limit);
                res.setHeader('X-RateLimit-Remaining', Math.max(0, limit - count));
                res.setHeader('X-RateLimit-Reset', new Date(now + windowMs).toISOString());
            }

            if (count > limit) {
                res.setHeader('Retry-After', Math.ceil(windowMs / 1000));
                return handler(req, res);
            }

            // Refund this request after the fact if configured to skip it
            if (skipSuccessfulRequests || skipFailedRequests) {
                res.on('finish', () => {
                    const failed = res.statusCode >= 400;
                    if ((skipSuccessfulRequests && !failed) || (skipFailedRequests && failed)) {
                        redis.zrem(redisKey, member).catch(() => {});
                    }
                });
            }

            next();
        } catch (error) {
            console.error('Rate limiter error:', error);
            // Fail open - allow request if Redis is down
            next();
        }
    };
}

// Usage examples
app.use('/api', createRateLimiter({
    windowMs: 60000,
    max: 100
}));

app.use('/api/expensive', createRateLimiter({
    windowMs: 60000,
    max: 10
}));

app.use('/api/auth/login', createRateLimiter({
    windowMs: 900000,  // 15 minutes
    max: 5,            // 5 attempts
    skipSuccessfulRequests: true
}));

Different Limits for Different Endpoints

const rateLimiters = {
    default: createRateLimiter({ max: 100, windowMs: 60000 }),
    strict: createRateLimiter({ max: 10, windowMs: 60000 }),
    auth: createRateLimiter({ max: 5, windowMs: 900000 }),
    public: createRateLimiter({ max: 1000, windowMs: 60000 })
};

// Apply different limits
app.use('/api', rateLimiters.default);
app.post('/api/auth/login', rateLimiters.auth);
app.use('/api/admin', rateLimiters.strict);
app.use('/api/public', rateLimiters.public);

Rate Limiting by User vs IP

function getRateLimiterKey(req) {
    // Authenticated users: rate limit by user ID
    if (req.user) {
        return `user:${req.user.id}`;
    }

    // Anonymous: rate limit by IP
    // Only trust forwarded headers when behind a known proxy
    // (in Express, prefer req.ip with 'trust proxy' configured)
    const ip = req.headers['x-forwarded-for']?.split(',')[0]?.trim()
               || req.headers['x-real-ip']
               || req.socket.remoteAddress;

    return `ip:${ip}`;
}

app.use(createRateLimiter({
    keyGenerator: getRateLimiterKey,
    max: (req) => req.user ? 1000 : 100  // Higher limit for authenticated users
}));

Tiered Rate Limiting

Different limits for different user tiers:

const TIER_LIMITS = {
    free: { requests: 100, window: 3600 },      // 100/hour
    basic: { requests: 1000, window: 3600 },    // 1000/hour
    premium: { requests: 10000, window: 3600 }, // 10k/hour
    enterprise: { requests: 100000, window: 3600 } // 100k/hour
};

function createTieredRateLimiter() {
    return async (req, res, next) => {
        const tier = req.user?.tier || 'free';
        const limits = TIER_LIMITS[tier];

        const key = `rate_limit:${tier}:${req.user?.id || req.ip}`;
        const now = Date.now();
        const windowStart = now - (limits.window * 1000);

        // Prune expired entries so the sorted set doesn't grow unbounded
        await redis.zremrangebyscore(key, 0, windowStart);
        const count = await redis.zcount(key, windowStart, now);

        if (count >= limits.requests) {
            return res.status(429).json({
                error: 'Rate limit exceeded',
                tier: tier,
                limit: limits.requests,
                window: limits.window,
                upgradeUrl: '/pricing'
            });
        }

        await redis.zadd(key, now, `${now}-${Math.random()}`);
        await redis.expire(key, limits.window * 2);

        res.setHeader('X-RateLimit-Tier', tier);
        res.setHeader('X-RateLimit-Limit', limits.requests);
        res.setHeader('X-RateLimit-Remaining', limits.requests - count - 1);

        next();
    };
}

app.use('/api', createTieredRateLimiter());

Cost-Based Rate Limiting

Different endpoints cost different amounts:

const ENDPOINT_COSTS = {
    'GET /api/users': 1,
    'POST /api/search': 5,
    'POST /api/ai/generate': 100,
    'GET /api/reports/export': 50
};

class CostBasedRateLimiter {
    constructor(redis, budget, windowMs) {
        this.redis = redis;
        this.budget = budget;
        this.windowMs = windowMs;
    }

    async isAllowed(clientId, cost) {
        const key = `cost_limit:${clientId}`;

        // Get total cost spent in the current window
        // (note: this check-then-spend isn't atomic; a Lua script would fix
        // the race, as in the Redis token bucket above)
        const spent = parseInt(await this.redis.get(`${key}:spent`), 10) || 0;

        if (spent + cost > this.budget) {
            return false;
        }

        // Record this cost; only set the TTL when the key is first created,
        // otherwise every request would push the window forward
        const newTotal = await this.redis.incrby(`${key}:spent`, cost);
        if (newTotal === cost) {
            await this.redis.expire(`${key}:spent`, Math.ceil(this.windowMs / 1000));
        }

        return true;
    }
}

const costLimiter = new CostBasedRateLimiter(redis, 1000, 60000);

app.use(async (req, res, next) => {
    const endpoint = `${req.method} ${req.path}`;
    const cost = ENDPOINT_COSTS[endpoint] || 1;
    const clientId = req.user?.id || req.ip;

    if (await costLimiter.isAllowed(clientId, cost)) {
        res.setHeader('X-RateLimit-Cost', cost);
        next();
    } else {
        res.status(429).json({
            error: 'Budget exceeded',
            cost: cost,
            budget: 1000
        });
    }
});

Testing Rate Limiters

const assert = require('assert');

async function testFixedWindow() {
    const limiter = new FixedWindowRateLimiter(10, 1000);

    // Should allow 10 requests
    for (let i = 0; i < 10; i++) {
        assert(await limiter.isAllowed('client1'), `Request ${i+1} should be allowed`);
    }

    // 11th should be rejected
    assert(!await limiter.isAllowed('client1'), '11th request should be rejected');

    // Wait for window to reset
    await new Promise(resolve => setTimeout(resolve, 1100));

    // Should allow again
    assert(await limiter.isAllowed('client1'), 'Should allow after window reset');

    console.log('✅ Fixed window tests passed');
}

async function testTokenBucket() {
    const limiter = new TokenBucketRateLimiter(10, 5, 1000);

    // Consume all tokens
    for (let i = 0; i < 10; i++) {
        assert(await limiter.isAllowed('client1'), `Request ${i+1} should be allowed`);
    }

    // Should be rejected
    assert(!await limiter.isAllowed('client1'), 'Should be rejected when tokens exhausted');

    // Wait for refill
    await new Promise(resolve => setTimeout(resolve, 1100));

    // Should have 5 tokens now
    for (let i = 0; i < 5; i++) {
        assert(await limiter.isAllowed('client1'), `Request ${i+1} after refill should be allowed`);
    }

    assert(!await limiter.isAllowed('client1'), 'Should be rejected after using refilled tokens');

    console.log('✅ Token bucket tests passed');
}

testFixedWindow()
    .then(testTokenBucket)
    .catch(console.error);

Monitoring and Alerts

Track rate limit metrics:

const prometheus = require('prom-client');

const rateLimitCounter = new prometheus.Counter({
    name: 'rate_limit_requests_total',
    help: 'Total rate limit checks',
    labelNames: ['client_id', 'allowed']
});

const rateLimitRejections = new prometheus.Counter({
    name: 'rate_limit_rejections_total',
    help: 'Total rate limit rejections',
    labelNames: ['endpoint']
});

async function rateLimitWithMetrics(clientId, endpoint, limiter) {
    const allowed = await limiter.isAllowed(clientId);

    rateLimitCounter.inc({
        client_id: clientId,   // beware label cardinality with many clients
        allowed: allowed.toString()
    });

    if (!allowed) {
        rateLimitRejections.inc({
            endpoint: endpoint
        });
    }

    return allowed;
}

// Alert if rejection rate > 10%
setInterval(async () => {
    const metrics = await prometheus.register.metrics();
    // Parse and alert if needed
}, 60000);

Algorithm Comparison

Algorithm        Memory  Accuracy  Bursts  Distributed  Complexity
Fixed Window     Low     Poor      High    Easy         Simple
Sliding Log      High    Perfect   Low     Hard         Medium
Sliding Counter  Low     Good      Medium  Easy         Medium
Token Bucket     Low     Good      Smooth  Medium       Complex

When to use:

  • Fixed Window: Simple APIs, low traffic, memory constrained
  • Sliding Log: Need perfect accuracy, can afford memory cost
  • Sliding Counter: Good balance of accuracy and efficiency
  • Token Bucket: Need burst handling, different endpoint costs

Production Checklist

Before deploying rate limiting:

// 1. Handle rate limit headers
res.setHeader('X-RateLimit-Limit', limit);
res.setHeader('X-RateLimit-Remaining', remaining);
res.setHeader('X-RateLimit-Reset', resetTime);
res.setHeader('Retry-After', retryAfter);

// 2. Return proper status codes
res.status(429).json({
    error: 'Too many requests',
    retryAfter: 60
});

// 3. Whitelist internal services
if (req.headers['x-internal-service'] === process.env.INTERNAL_SECRET) {
    return next();
}

// 4. Fail open if Redis is down
try {
    const allowed = await rateLimiter.isAllowed(clientId);
} catch (error) {
    console.error('Rate limiter error:', error);
    return next(); // Allow request if rate limiter fails
}

// 5. Log rejections for analysis
if (!allowed) {
    logger.warn('Rate limit exceeded', {
        clientId,
        endpoint: req.path,
        userAgent: req.headers['user-agent']
    });
}

// 6. Monitor metrics
rateLimitMetrics.rejections.inc();
rateLimitMetrics.allowed.inc();

Real-World Configuration

My production setup:

// Public endpoints: generous limits
app.use('/api/public', createRateLimiter({
    max: 1000,
    windowMs: 60000
}));

// Authenticated endpoints: per-user limits
app.use('/api', authenticate, createRateLimiter({
    keyGenerator: (req) => req.user.id,
    max: 5000,
    windowMs: 60000
}));

// Auth endpoints: strict limits
app.post('/api/auth/login', createRateLimiter({
    max: 5,
    windowMs: 900000,  // 15 minutes
    skipSuccessfulRequests: true  // Only count failed attempts
}));

app.post('/api/auth/register', createRateLimiter({
    max: 3,
    windowMs: 3600000  // 1 hour
}));

// Expensive operations: very strict
app.post('/api/reports/generate', createRateLimiter({
    max: 10,
    windowMs: 3600000  // 1 hour
}));

// Webhooks: rate limit per source
app.post('/webhooks/:source', createRateLimiter({
    keyGenerator: (req) => `webhook:${req.params.source}`,
    max: 1000,
    windowMs: 60000
}));

The Bottom Line

Rate limiting is essential for production APIs:

Start with fixed window for simplicity, then upgrade to token bucket or sliding counter if you need better accuracy or burst handling.

Use Redis for distributed systems. In-memory works fine for single servers.

Different limits for different contexts: public vs authenticated, free vs paid tiers, cheap vs expensive endpoints.

Monitor rejections: high rejection rates mean limits are too strict or you're under attack.

Fail open: if rate limiter breaks, allow requests through. Better to risk abuse than break your entire API.

Rate limiting saved my API from melting down. Implement it before you need it - preventing abuse is easier than recovering from it.