How DNS Actually Works

TL;DR

DNS is a distributed hierarchy of servers. Your OS checks cache, then asks a resolver, which walks the tree from root → TLD → authoritative server. TTL controls caching. Understanding this saves hours of debugging.

I spent an embarrassing amount of time treating DNS as a black box. Then I hit a propagation issue at 2am, read the actual RFC, and it clicked. Here's what I wish I'd understood earlier.

What DNS Does

DNS translates example.com into an IP address. That's the whole job. The complexity is in the distributed system that makes this work without a central database.

The Hierarchy

.                          ← Root (13 named root servers, a through m, each an anycast cluster)
├── com.                   ← TLD (run by Verisign)
│   ├── example.com.       ← Authoritative (run by whoever owns the domain)
│   └── github.com.
├── net.
└── io.

Every domain name is resolved right-to-left, from the root down. The fully qualified form, example.com., ends in a dot that stands for the root. Most clients treat that dot as implied, so you rarely type it.
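
To make the right-to-left idea concrete, here's a minimal Python sketch. The function name delegation_chain is my own, and it assumes every label is its own zone, which real domains don't always do (a zone can hold names several labels deep).

```python
def delegation_chain(name: str) -> list[str]:
    """Return the names a resolver passes through, from the root down."""
    # Drop the trailing dot (if any) and split into labels.
    labels = [label for label in name.rstrip(".").split(".") if label]
    chain = ["."]  # everything starts at the root
    # Walk the labels right-to-left, building longer suffixes each time.
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain


print(delegation_chain("www.example.com"))
# ['.', 'com.', 'example.com.', 'www.example.com.']
```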

What Happens When You Visit a Site

You type: example.com

1. OS checks /etc/hosts          → not there
2. OS checks local DNS cache     → miss (or expired)
3. OS asks your recursive resolver (usually your router, which forwards to your ISP's resolver, or a public one like 8.8.8.8)

4. Resolver checks its cache     → miss
5. Resolver asks a root server: "Who handles .com?"
   Root: "Ask the .com TLD servers at these IPs"

6. Resolver asks TLD server: "Who handles example.com?"
   TLD: "Ask example.com's authoritative server at these IPs"

7. Resolver asks authoritative server: "What's the IP for example.com?"
   Auth: "93.184.216.34, TTL 3600"

8. Resolver caches the answer for 3600 seconds, returns it to you
9. Your OS caches it. Browser caches it. You get the IP.

On a cold cache the whole walk typically takes 50-200ms. When the answer is already in a local cache, it takes under 1ms.
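
The referral walk in steps 5 through 7 can be sketched as a toy simulation. Nothing here touches the network: the SERVERS table, the server names, and the resolve function are all invented for illustration, and a real resolver also handles glue records, retries, and caching at every step.

```python
# Toy data: each "server" maps a zone to either a referral or an answer.
# Server names and layout are illustrative, not real server data.
SERVERS = {
    "root": {"com.": ("referral", "com-tld")},
    "com-tld": {"example.com.": ("referral", "example-auth")},
    "example-auth": {"example.com.": ("answer", ("93.184.216.34", 3600))},
}


def resolve(name: str, start: str = "root"):
    """Follow referrals from the root until a server returns an answer."""
    fqdn = name if name.endswith(".") else name + "."
    server, path = start, []
    while True:
        path.append(server)
        # Find the zones this server knows that cover the name.
        matches = [z for z in SERVERS[server] if fqdn == z or fqdn.endswith("." + z)]
        if not matches:
            return None, path  # stands in for NXDOMAIN
        kind, value = SERVERS[server][max(matches, key=len)]
        if kind == "answer":
            return value, path
        server = value  # referral: ask the next server down


answer, path = resolve("example.com")
print(path)    # ['root', 'com-tld', 'example-auth']
print(answer)  # ('93.184.216.34', 3600)
```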

DNS Record Types

A        example.com → 93.184.216.34          (IPv4)
AAAA     example.com → 2606:2800:220:1::93    (IPv6)
CNAME    www → example.com                    (alias, not an IP)
MX       example.com → mail.example.com (10)  (mail routing)
TXT      "v=spf1 include:..."                 (arbitrary text, SPF/DKIM live here)
NS       example.com → ns1.nameserver.com     (who's authoritative)
SOA      Zone metadata, primary NS, serial    (admin stuff)

CNAME chains are real and slow. Each hop can cost another lookup unless the resolver already has the target cached.
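
Here's a rough sketch of what following a chain looks like, using an in-memory table instead of real lookups. The shop.example.com name, the RECORDS table, and the max_hops limit are hypothetical. Real resolvers enforce their own limits on chain length.

```python
# Hypothetical records: shop -> www -> apex, then an A record.
RECORDS = {
    "shop.example.com.": ("CNAME", "www.example.com."),
    "www.example.com.": ("CNAME", "example.com."),
    "example.com.": ("A", "93.184.216.34"),
}


def follow_cnames(name: str, max_hops: int = 8):
    """Follow CNAMEs until an A record; return (address, hops taken)."""
    hops = 0
    while hops <= max_hops:
        rtype, value = RECORDS[name]
        if rtype == "A":
            return value, hops
        name = value  # each CNAME is one more lookup
        hops += 1
    raise RuntimeError("CNAME chain too long or looping")


print(follow_cnames("shop.example.com."))  # ('93.184.216.34', 2)
```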

TTL Controls Everything

# Check TTL on a record
dig example.com

# Output:
# example.com.    3600   IN   A   93.184.216.34
#                  ^^^^
#                  This record is cached for 3600 seconds (1 hour)

# Watch TTL count down from cache:
dig example.com +noall +answer
# Second call shows lower TTL if your resolver is caching

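When you change DNS records, the old ones live in caches until their TTL expires. To speed up propagation, lower the TTL before you make the change, and do it at least one old-TTL period ahead so every cache has picked up the short value. For example, if the current TTL is 86400 (1 day), set it to 300 (5 min) a day ahead of time.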
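
If it helps to see the countdown as code, here's a minimal TTL cache sketch. The TTLCache class is my own illustration, not how any particular resolver is implemented, and the clock is injectable so the example can fast-forward time without sleeping.

```python
import time


class TTLCache:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (value, expires_at)

    def put(self, name, value, ttl):
        self._entries[name] = (value, self._clock() + ttl)

    def get(self, name):
        """Return (value, remaining_ttl), or None if missing or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        remaining = expires_at - self._clock()
        if remaining <= 0:
            del self._entries[name]  # expired: the next lookup goes upstream
            return None
        return value, int(remaining)


now = [0.0]  # fake clock we can move by hand
cache = TTLCache(clock=lambda: now[0])
cache.put("example.com.", "93.184.216.34", ttl=3600)
now[0] = 600
print(cache.get("example.com."))  # ('93.184.216.34', 3000)
now[0] = 3600
print(cache.get("example.com."))  # None
```

This is also why a second dig against a caching resolver shows a lower TTL: you're seeing the remaining lifetime, not the original value.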

Debugging DNS

# Basic lookup
dig example.com

# Trace the full resolution path
dig example.com +trace

# Ask a specific server directly (bypasses your local resolver; the public resolver still has its own cache)
dig @8.8.8.8 example.com
dig @1.1.1.1 example.com

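# Check what your system resolves (on Linux, getent goes through /etc/hosts + the system resolver)
getent hosts example.com

# nslookup queries DNS servers directly and skips /etc/hosts
nslookup example.com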

# Check DNS propagation across global resolvers
dig @8.8.8.8 example.com    # Google
dig @1.1.1.1 example.com    # Cloudflare
dig @9.9.9.9 example.com    # Quad9

# Reverse lookup (IP → hostname)
dig -x 93.184.216.34

# Check MX records
dig example.com MX

# Check TXT records (SPF, DKIM, domain verification)
dig example.com TXT

# Short output
dig example.com +short
# 93.184.216.34
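
When I'm checking propagation I usually want a yes/no on whether the public resolvers agree. Here's a small sketch that wraps dig +short. It assumes dig is installed and on your PATH, and the function names dig_short and compare_resolvers are mine. The query function is a parameter so the comparison logic can be exercised without network access.

```python
import subprocess


def dig_short(server: str, name: str, rtype: str = "A") -> list[str]:
    """Run `dig @server name rtype +short` and return the sorted answer lines."""
    out = subprocess.run(
        ["dig", f"@{server}", name, rtype, "+short"],
        capture_output=True, text=True, timeout=10, check=True,
    ).stdout
    return sorted(line for line in out.splitlines() if line)


def compare_resolvers(name, servers, query=dig_short):
    """Return ({server: answers}, agree) where agree means all answers match."""
    results = {server: query(server, name) for server in servers}
    agree = len({tuple(answers) for answers in results.values()}) <= 1
    return results, agree


# Example call (needs network access and dig):
# results, agree = compare_resolvers("example.com", ["8.8.8.8", "1.1.1.1", "9.9.9.9"])
```

Disagreement isn't always a problem. Names behind load balancers or geo-aware DNS can legitimately return different addresses per resolver.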

The +trace Output Explained

dig example.com +trace

# Shows:
# . (root) → which .com TLD servers to ask
# com. (TLD) → which example.com authoritative servers to ask
# example.com. → the actual A record

# Every hop shows which server responded and how long it took

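This is the tool for debugging "my DNS change isn't propagating." Because +trace walks the delegation chain itself instead of trusting your resolver's cache, it tells you whether the old data is coming from an authoritative server or from a cache somewhere downstream.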
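
If you want the per-hop timing as data, the ";; Received ..." lines are easy to pull out with a regex. This is a sketch under assumptions: the line format below is what I see from the dig versions I've used and may differ in yours, and SAMPLE is hand-written in that format (with a documentation-range IP and a made-up ns1.example.net), not captured output.

```python
import re

# Assumed format: ";; Received <n> bytes from <ip>#<port>(<host>) in <ms> ms"
RECEIVED = re.compile(
    r"^;; Received (\d+) bytes from (\S+)#(\d+)\((\S+)\) in (\d+) ms$"
)


def parse_trace_hops(trace_output: str):
    """Extract the responding server and latency for each hop."""
    hops = []
    for line in trace_output.splitlines():
        m = RECEIVED.match(line.strip())
        if m:
            _size, ip, _port, host, ms = m.groups()
            hops.append({"server": host, "ip": ip, "ms": int(ms)})
    return hops


# Hand-written sample lines, for illustration only.
SAMPLE = """\
;; Received 239 bytes from 198.41.0.4#53(a.root-servers.net) in 24 ms
;; Received 1170 bytes from 192.5.6.30#53(a.gtld-servers.net) in 31 ms
;; Received 56 bytes from 192.0.2.53#53(ns1.example.net) in 18 ms
"""

for hop in parse_trace_hops(SAMPLE):
    print(hop["server"], hop["ms"])
# a.root-servers.net 24
# a.gtld-servers.net 31
# ns1.example.net 18
```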

Common Problems

The CNAME Trap

# This is fine:
www.example.com   CNAME   example.com
example.com       A       93.184.216.34

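# This is invalid (and breaks mail, among other things):
example.com       CNAME   something.else.com
# RFC 1034: a CNAME cannot coexist with other records at the same name.
# The apex always carries SOA and NS records (and usually MX), so it can't be a CNAME.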
# Some providers work around this with CNAME flattening / ALIAS records
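
A quick way to catch this before it ships is to lint your records for names where a CNAME shares an owner name with anything else. A minimal sketch, assuming records are available as (name, type, value) tuples. The cname_conflicts function is my own, and it ignores DNSSEC record types, which are allowed alongside a CNAME.

```python
from collections import defaultdict


def cname_conflicts(records):
    """Return names where a CNAME coexists with any other record type."""
    types_by_name = defaultdict(set)
    for name, rtype, _value in records:
        types_by_name[name].add(rtype)
    return sorted(
        name for name, types in types_by_name.items()
        if "CNAME" in types and len(types) > 1
    )


zone = [
    ("example.com.", "SOA", "(zone metadata)"),
    ("example.com.", "NS", "ns1.nameserver.com."),
    ("example.com.", "MX", "10 mail.example.com."),
    ("example.com.", "CNAME", "something.else.com."),  # the trap
    ("www.example.com.", "CNAME", "example.com."),      # fine on its own
]
print(cname_conflicts(zone))  # ['example.com.']
```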

Split-Horizon DNS

Split-horizon means the same name resolves differently depending on who asks. Your internal DNS returns 10.0.1.5 for api.example.com. External DNS returns the public IP. When debugging, always confirm which resolver you're using.

# What does my configured resolver return?
dig api.example.com

# What does a public resolver return?
dig @8.8.8.8 api.example.com
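
Under the hood, split-horizon is just "pick a view based on where the query came from." Here's a toy sketch of that selection. The VIEWS table and answer_for are illustrative, the 10.0.0.0/8 range is an assumed internal network, and 203.0.113.10 is a documentation address standing in for the public IP.

```python
import ipaddress

# Hypothetical views: the first network that contains the client wins.
VIEWS = [
    (ipaddress.ip_network("10.0.0.0/8"), {"api.example.com.": "10.0.1.5"}),
    (ipaddress.ip_network("0.0.0.0/0"), {"api.example.com.": "203.0.113.10"}),
]


def answer_for(client_ip: str, name: str):
    """Return the address this client would see, or None if no view has it."""
    client = ipaddress.ip_address(client_ip)
    for network, zone in VIEWS:
        if client in network:
            return zone.get(name)
    return None


print(answer_for("10.2.3.4", "api.example.com."))      # 10.0.1.5
print(answer_for("198.51.100.7", "api.example.com."))  # 203.0.113.10
```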

Negative Caching

NXDOMAIN (the name doesn't exist) is cached too. Per RFC 2308, the negative-cache TTL is the lower of the SOA record's own TTL and its MINIMUM field. If you create a record that previously didn't exist, resolvers that already asked for it may keep answering NXDOMAIN until that negative TTL expires.
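
The arithmetic is small but easy to get backwards, so here it is spelled out. This follows the RFC 2308 rule (the lower of the SOA record's TTL and its MINIMUM field). The function names are mine, and individual resolvers may additionally cap negative TTLs at their own maximum.

```python
def negative_ttl(soa_ttl: int, soa_minimum: int) -> int:
    """Seconds a resolver may cache an NXDOMAIN under the RFC 2308 rule."""
    return min(soa_ttl, soa_minimum)


def visible_after(nxdomain_cached_at: float, soa_ttl: int, soa_minimum: int) -> float:
    """Earliest time a resolver that cached the NXDOMAIN will ask again."""
    return nxdomain_cached_at + negative_ttl(soa_ttl, soa_minimum)


print(negative_ttl(soa_ttl=3600, soa_minimum=86400))         # 3600
print(visible_after(1000.0, soa_ttl=3600, soa_minimum=300))  # 1300.0
```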

DNS Over HTTPS (DoH) and DNS Over TLS (DoT)

Traditional DNS is plaintext on port 53 (UDP, with TCP for large responses). Your ISP, or anyone else on the network path, can see every domain you look up.

DoH: DNS queries sent over HTTPS (port 443) → encrypted, looks like web traffic
DoT: DNS queries sent over TLS (port 853)   → encrypted, but identifiable as DNS

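Cloudflare (1.1.1.1) and Google (8.8.8.8) support both. Browser behavior varies: Firefox enables DoH by default in some regions, and Chrome upgrades to DoH automatically when your configured resolver is known to support it.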
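
DoH is less exotic than it sounds: it's the same wire-format DNS message, wrapped in an HTTPS request. Here's a sketch that builds a query and the GET URL described in RFC 8484, without sending anything. The function names are mine, and the default endpoint is Cloudflare's published DoH URL as I understand it, so treat it as an assumption and swap in your provider's.

```python
import base64
import struct


def build_query(name: str, qtype: int = 1) -> bytes:
    """Encode a single-question DNS query in wire format (qtype 1 = A)."""
    # Header: ID=0 (RFC 8484 recommends 0 so HTTP caches can share responses),
    # flags=0x0100 (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN
    return header + question


def doh_get_url(name: str, endpoint: str = "https://cloudflare-dns.com/dns-query") -> str:
    """Build an RFC 8484 GET URL: base64url of the query, padding stripped."""
    payload = base64.urlsafe_b64encode(build_query(name)).rstrip(b"=").decode("ascii")
    return f"{endpoint}?dns={payload}"


print(doh_get_url("www.example.com"))
# https://cloudflare-dns.com/dns-query?dns=AAABAAABAAAAAAAAA3d3dwdleGFtcGxlA2NvbQAAAQAB
```

To actually send it, you'd issue a GET to that URL with an Accept: application/dns-message header and parse the binary response. The encoded payload above matches the worked example in RFC 8484.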

The Bottom Line

DNS is a distributed cache with a hierarchical lookup fallback. The resolver does the work; your client just asks and waits.

The rules:

  • Lower TTL before making changes, not after
  • Use dig +trace to see exactly which server is returning what
  • Query @8.8.8.8 to bypass local resolver cache when debugging
  • Apex domains can't be CNAMEs, so use ALIAS/ANAME records at the apex
  • NXDOMAIN caches too, so new records take time to become visible

Understanding the resolution path takes DNS from magic to mechanical.