Feature Flags: Ship Code Without Breaking Things
TL;DR
Feature flags are runtime toggles that control what code runs. Use them for gradual rollouts, A/B tests, kill switches, and beta access. Store them in Redis or a database, not config files, so they can change without a deploy. Kill switches let you contain a production incident in seconds instead of waiting on a rollback.
We pushed a new checkout flow at 2am to avoid traffic. It had a bug. Rolling back meant another deploy, another 10-minute wait, more downtime. The next week, I added feature flags. Now we deploy code, flip the flag when ready, and flip it off in seconds if something breaks. No deploy required.
Here's how to build a feature flag system that actually makes deployments less stressful.
What Feature Flags Are
A feature flag is a runtime conditional that controls what code executes:
// BAD - deployment and release are coupled
// To disable this, you have to redeploy
function checkout(cart) {
  return newCheckoutFlow(cart);
}

// GOOD - deployment decoupled from release
async function checkout(cart) {
  if (await flags.isEnabled('new-checkout-flow', { userId: cart.userId })) {
    return newCheckoutFlow(cart);
  }
  return oldCheckoutFlow(cart);
}
Types of Feature Flags
Release Flags (most common)
Enable new features gradually:
// Roll out to 10% of users first, increase as confidence grows
const flag = {
  name: 'new-dashboard',
  enabled: true,
  rollout: 10, // 10% of users
};
Kill Switches
Emergency off-switch for broken features:
// Payment service down? Disable it without a deploy
if (!(await flags.isEnabled('stripe-payments'))) {
  return { error: 'Payments temporarily unavailable' };
}
Experiment Flags (A/B Tests)
Test variants with real users:
const variant = await flags.getVariant('checkout-button-color', { userId });
// variant.name is 'blue' or 'green', based on the configured split
track('checkout_shown', { variant: variant.name, userId });
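The `getVariant` call above is not implemented anywhere in this article. Here is a minimal sketch of how variant assignment could work, assuming a hypothetical `variants` array of `{ name, weight }` entries on the flag (weights summing to 100) and the same style of string hash used for percentage rollouts. `pickVariant` is an illustrative name, not an existing API:

```javascript
// Hypothetical flag shape: `variants` and `weight` are assumptions for illustration
const buttonColorFlag = {
  name: 'checkout-button-color',
  variants: [
    { name: 'blue', weight: 50 },
    { name: 'green', weight: 50 },
  ],
};

// Simple 32-bit string hash, returned as a non-negative number
function hashString(str) {
  let hash = 0;
  for (let i = 0; i < str.length; i++) {
    hash = ((hash << 5) - hash + str.charCodeAt(i)) | 0;
  }
  return Math.abs(hash);
}

// Map the user to a bucket 0-99, then walk the cumulative weights
function pickVariant(flag, userId) {
  const bucket = hashString(`${flag.name}:${userId}`) % 100;
  let cumulative = 0;
  for (const variant of flag.variants) {
    cumulative += variant.weight;
    if (bucket < cumulative) return variant;
  }
  // Weights summed to less than 100: fall back to the last variant
  return flag.variants[flag.variants.length - 1];
}

const variant = pickVariant(buttonColorFlag, 'user-42');
console.log(variant.name); // the same user always gets the same variant
```

Hashing on the flag name plus the user ID keeps assignment sticky per user while still letting different experiments split the user base differently.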
Permission Flags
Control access by role or attribute:
// Only admins and analysts see the analytics dashboard
const flag = {
  name: 'admin-analytics',
  enabled: true, // isEnabled returns false for flags without enabled: true
  rules: [
    { attribute: 'role', operator: 'in', value: ['admin', 'analyst'] },
  ],
};
Building a Simple Flag System
You don't need a SaaS for basic flags. Here's a Redis-backed implementation:
// flags.js
const Redis = require('ioredis');
const redis = new Redis(); // defaults to localhost:6379

class FeatureFlags {
  constructor() {
    this.cache = new Map();
    this.cacheTTL = 30000; // 30s local cache to avoid hammering Redis
  }

  // Fetch a flag from the local cache, falling back to Redis
  async getFlag(name) {
    const cacheKey = `flag:${name}`;
    const cached = this.cache.get(cacheKey);
    if (cached && Date.now() < cached.expiresAt) {
      return cached.value;
    }
    const value = await redis.get(`feature:${name}`);
    const flag = value ? JSON.parse(value) : null;
    this.cache.set(cacheKey, {
      value: flag,
      expiresAt: Date.now() + this.cacheTTL,
    });
    return flag;
  }

  async isEnabled(flagName, context = {}) {
    const flag = await this.getFlag(flagName);
    // Unknown or disabled flags are off
    if (!flag || !flag.enabled) return false;
    // Targeting rules take precedence over percentage rollout
    if (flag.rules?.length > 0) {
      return this.evaluateRules(flag.rules, context);
    }
    // Percentage rollout
    if (flag.rollout !== undefined) {
      return this.isInRollout(
        context.userId || context.sessionId,
        flagName,
        flag.rollout
      );
    }
    return true;
  }

  isInRollout(id, flagName, percentage) {
    // No stable ID: fall back to a random draw, which is NOT sticky across calls
    if (!id) return Math.random() * 100 < percentage;
    // Deterministic hash: same user always gets same result
    const hash = this.hashString(`${flagName}:${id}`);
    return (hash % 100) < percentage;
  }

  hashString(str) {
    let hash = 0;
    for (let i = 0; i < str.length; i++) {
      hash = ((hash << 5) - hash) + str.charCodeAt(i);
      hash |= 0; // Truncate to a 32-bit integer
    }
    return Math.abs(hash);
  }

  // All rules must match (logical AND)
  evaluateRules(rules, context) {
    return rules.every(rule => {
      const value = context[rule.attribute];
      switch (rule.operator) {
        case 'equals': return value === rule.value;
        case 'in': return Array.isArray(rule.value) && rule.value.includes(value);
        case 'contains': return Boolean(value?.includes?.(rule.value));
        default: return false; // Unknown operators never match
      }
    });
  }
}

module.exports = new FeatureFlags();
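# Python version
import hashlib
import json
import random
from typing import Optional

import redis

r = redis.Redis()

def is_enabled(flag_name: str, context: Optional[dict] = None) -> bool:
    """Minimal check: enabled + percentage rollout (targeting rules are not evaluated)."""
    context = context or {}
    raw = r.get(f"feature:{flag_name}")
    if not raw:
        return False
    flag = json.loads(raw)
    if not flag.get("enabled"):
        return False
    rollout = flag.get("rollout")
    if rollout is not None:
        user_id = context.get("user_id")
        if not user_id:
            # No stable ID: random draw, matching the JS version. Hashing an
            # empty ID would put every anonymous caller in the same bucket.
            return random.random() * 100 < rollout
        # MD5 is used only for bucketing, not security. Its buckets differ from
        # the JS hash above, so a user can get different results per service.
        seed = f"{flag_name}:{user_id}".encode()
        hash_val = int(hashlib.md5(seed).hexdigest(), 16)
        return (hash_val % 100) < rollout
    return True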
Database Schema
CREATE TABLE feature_flags (
  name TEXT PRIMARY KEY,
  enabled BOOLEAN NOT NULL DEFAULT false,
  rollout_percentage INTEGER CHECK (rollout_percentage BETWEEN 0 AND 100),
  rules JSONB DEFAULT '[]',
  description TEXT,
  updated_at TIMESTAMPTZ DEFAULT NOW()
);

-- Audit log
CREATE TABLE feature_flag_history (
  id SERIAL PRIMARY KEY,
  flag_name TEXT NOT NULL,
  action TEXT NOT NULL,
  old_value JSONB,
  new_value JSONB,
  changed_by TEXT,
  changed_at TIMESTAMPTZ DEFAULT NOW()
);

INSERT INTO feature_flags (name, enabled, rollout_percentage, description) VALUES
  ('new-checkout-flow', true, 10, 'New simplified checkout, rolling out gradually'),
  ('stripe-payments', true, 100, 'Kill switch for Stripe integration'),
  ('dark-mode', false, 0, 'Dark mode UI, not ready yet');
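The table stores the percentage in `rollout_percentage`, while the Redis-backed class reads a `rollout` field from JSON stored under `feature:<name>`. If the database is treated as the source of truth and Redis as the serving layer (an assumption; the article does not say how the two are combined), each row needs a small translation before it is written to Redis. `rowToRedisEntry` below is a hypothetical helper sketching that mapping:

```javascript
// Hypothetical helper: converts a feature_flags row into the key/value pair
// that the Redis-backed FeatureFlags class reads.
function rowToRedisEntry(row) {
  const flag = { enabled: row.enabled };
  // A NULL rollout_percentage means no percentage rollout is configured
  if (row.rollout_percentage !== null && row.rollout_percentage !== undefined) {
    flag.rollout = row.rollout_percentage;
  }
  // Only include rules when there is at least one
  if (Array.isArray(row.rules) && row.rules.length > 0) {
    flag.rules = row.rules;
  }
  return { key: `feature:${row.name}`, value: JSON.stringify(flag) };
}

const entry = rowToRedisEntry({
  name: 'new-checkout-flow',
  enabled: true,
  rollout_percentage: 10,
  rules: [],
});
console.log(entry.key);   // feature:new-checkout-flow
console.log(entry.value); // {"enabled":true,"rollout":10}
```

With an ioredis client, the pair could then be written with `redis.set(entry.key, entry.value)`.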
Gradual Rollout Pattern
// Week 1: Internal team only
await setFlag('new-feature', {
  enabled: true,
  rules: [{ attribute: 'email', operator: 'contains', value: '@yourcompany.com' }]
});
// Week 2: 5% of users
await setFlag('new-feature', { enabled: true, rollout: 5 });
// Week 3: 25% after no issues
await setFlag('new-feature', { enabled: true, rollout: 25 });
// Week 4: 100%
await setFlag('new-feature', { enabled: true, rollout: 100 });
// Week 5: Remove flag entirely, delete old code path
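`setFlag` is used in the rollout steps but never defined. One possible shape, assuming it simply serializes the flag to the `feature:<name>` key that `getFlag` reads. The extra `store` parameter is a hypothetical stand-in for anything with an async `set(key, value)` method, such as an ioredis client; an in-memory fake is used here so the sketch runs without Redis. The second half checks a property the rollout plan relies on: because a user's bucket is fixed, raising the percentage only adds users and never removes anyone already in.

```javascript
// Hypothetical helper: `store` is any object with an async set(key, value)
async function setFlag(store, name, flag) {
  await store.set(`feature:${name}`, JSON.stringify(flag));
}

// Same hashing scheme as the FeatureFlags class, reduced to a bucket 0-99
function bucketFor(flagName, userId) {
  const str = `${flagName}:${userId}`;
  let hash = 0;
  for (let i = 0; i < str.length; i++) {
    hash = ((hash << 5) - hash + str.charCodeAt(i)) | 0;
  }
  return Math.abs(hash) % 100;
}

// In-memory stand-in for Redis, for illustration only
const memory = new Map();
const fakeStore = { set: async (key, value) => { memory.set(key, value); } };

async function demo() {
  await setFlag(fakeStore, 'new-feature', { enabled: true, rollout: 5 });
  await setFlag(fakeStore, 'new-feature', { enabled: true, rollout: 25 });
  console.log(memory.get('feature:new-feature')); // {"enabled":true,"rollout":25}

  // Every user inside the 5% rollout is also inside the 25% rollout
  const users = Array.from({ length: 1000 }, (_, i) => `user-${i}`);
  const at5 = users.filter((u) => bucketFor('new-feature', u) < 5);
  const stillIn = at5.every((u) => bucketFor('new-feature', u) < 25);
  console.log(stillIn); // true
}

demo();
```

Servers running the `FeatureFlags` class would pick up each change once their 30s local cache entry expires.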
The Kill Switch Pattern
The most important use case. Every risky external dependency should have one:
async function sendEmail(to, subject, body) {
  if (!(await flags.isEnabled('sendgrid-email'))) {
    console.warn('Email disabled by feature flag:', subject);
    return { skipped: true };
  }
  return await sendgrid.send({ to, subject, body });
}

async function processPayment(amount, card) {
  if (!(await flags.isEnabled('stripe-payments'))) {
    throw new Error('Payments temporarily unavailable');
  }
  return await stripe.charges.create({ amount, source: card });
}
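When Stripe has an incident, flip the flag directly in Redis. Note that SET replaces the whole stored JSON, so any rollout or rules on the flag are overwritten:
redis-cli SET feature:stripe-payments '{"enabled": false}'
# No deploy needed; each server picks up the change once its 30s local cache entry expires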
Avoiding Flag Debt
Flags accumulate. Old ones never get removed. Enforce hygiene:
// BAD - flag that should have been removed 2 years ago
if (await flags.isEnabled('launched-in-2023-feature')) {
  // This is always true now, dead code
}
Add expiry to flags that have a natural end date:
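const flag = {
  name: 'holiday-banner-2026',
  enabled: true,
  // Intended auto-disable date. The FeatureFlags class above does not read
  // this field, so isEnabled must check it for the expiry to take effect.
  expiresAt: '2026-01-05',
};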
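Nothing in the `FeatureFlags` class reads `expiresAt`, so the field needs an explicit check to have any effect. A minimal sketch, assuming `expiresAt` is an ISO date string (date-only strings are parsed as UTC midnight) and that a flag should evaluate as disabled from the start of that date. `isExpired` and `isActive` are hypothetical helpers, and the `now` parameter exists only to make the check testable:

```javascript
// Hypothetical helper: true once `now` has reached the flag's expiresAt
function isExpired(flag, now = new Date()) {
  if (!flag.expiresAt) return false; // no expiry configured
  const expiry = new Date(flag.expiresAt);
  // Unparseable date: treated as "no expiry" here, which is a design choice
  if (Number.isNaN(expiry.getTime())) return false;
  return now.getTime() >= expiry.getTime();
}

// Could run at the top of isEnabled, before rules and rollout are evaluated
function isActive(flag, now = new Date()) {
  return Boolean(flag && flag.enabled) && !isExpired(flag, now);
}

const banner = { name: 'holiday-banner-2026', enabled: true, expiresAt: '2026-01-05' };
console.log(isActive(banner, new Date('2026-01-04T12:00:00Z'))); // true
console.log(isActive(banner, new Date('2026-01-05T00:00:00Z'))); // false
```

An expired flag still lingers in the store and in the code, so expiry limits the damage of a forgotten flag without replacing the cleanup step.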
Monthly review:
# Find all feature flag references in code (grep -P requires GNU grep)
grep -r "flags.isEnabled" src/ --include="*.js" | \
  grep -oP '(?<=isEnabled\(["'"'"'])[^"'"'"']+' | \
  sort -u > /tmp/code-flags.txt
# List active flags in the DB (-At prints bare values, without headers or row count)
psql -At -c "SELECT name FROM feature_flags WHERE enabled = true" | \
  sort -u > /tmp/db-flags.txt
# Flags in DB but not in code = candidates for cleanup
comm -13 /tmp/code-flags.txt /tmp/db-flags.txt
The Bottom Line
Feature flags decouple deployment from release. Ship code freely, release carefully.
Key points:
- Store flags in Redis or database—config files require a deploy to change
- Use deterministic hashing for rollouts so the same user always gets the same experience
- Add kill switches to every risky external dependency
- Track flag usage and delete old ones—flag debt is real
- Log flag evaluations for debugging ("why didn't they see the new feature?")
The first time you turn off a broken feature without a deploy, you'll wonder how you lived without this.