01

The Latency Problem: Physics Is Not Optional

Before CDNs existed, a single server somewhere in a data center served all users worldwide. That sounds fine — until you realize that the speed of light in fiber optic cable is roughly 200,000 km/s, which is about two-thirds the speed of light in a vacuum. And that limit is not a software problem you can optimize away. It is physics.

THE PHYSICS CONSTRAINT

Light in fiber: ~200,000 km/s

  • Mumbai to New York = ~12,000 km of cable
  • One-way trip = ~60ms of pure propagation
  • TCP requires a round-trip to establish: +120ms before the first byte
  • TLS handshake adds 1–2 more round-trips: +120–240ms (240–360ms cumulative)

One page load = many round trips

  • HTML fetch: 1 RTT
  • CSS files: 2–5 RTTs
  • JS bundles: 5–15 RTTs
  • Images, fonts, API calls: 10–30 RTTs
  • Total (some fetches overlap): roughly 10–50 sequential RTTs × 120ms = 1.2–6 seconds from physics alone
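
The arithmetic above can be reproduced in a few lines. This is a minimal sketch that uses only the figures quoted in the text (200,000 km/s in fiber, ~12,000 km of cable); the function names are illustrative, and it models propagation delay only.

```python
FIBER_SPEED_KM_PER_S = 200_000  # approximate speed of light in fiber (from the text)


def one_way_ms(distance_km: float) -> float:
    """Pure propagation delay, in milliseconds, for one direction."""
    return distance_km * 1000 / FIBER_SPEED_KM_PER_S


def rtt_ms(distance_km: float) -> float:
    """Round-trip time: the signal has to go there and come back."""
    return 2 * one_way_ms(distance_km)


def page_load_seconds(distance_km: float, round_trips: int) -> float:
    """Seconds spent on propagation alone for a number of sequential round trips."""
    return rtt_ms(distance_km) * round_trips / 1000


MUMBAI_TO_NEW_YORK_KM = 12_000  # cable distance quoted in the text

print(one_way_ms(MUMBAI_TO_NEW_YORK_KM))             # 60.0
print(rtt_ms(MUMBAI_TO_NEW_YORK_KM))                 # 120.0
print(page_load_seconds(MUMBAI_TO_NEW_YORK_KM, 10))  # 1.2
print(page_load_seconds(MUMBAI_TO_NEW_YORK_KM, 50))  # 6.0
```

Real page loads overlap many requests, so treat the result as a rough bound on the sequential part rather than a prediction.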

The painful truth: a user in New York loading a page from a Mumbai server will wait over a second purely because of the speed of light — no amount of code optimization fixes this. The only solution is to move the data closer to the user. That's a CDN.

SINGLE ORIGIN SERVER — The Painful Round-Trip Reality
[Figure: a single ORIGIN server in Mumbai, connected by dashed round-trip lines to users in New York (~240ms RTT), London (~140ms RTT), and Tokyo (~110ms RTT).]
Every dashed line is a painful round-trip. Without a CDN, every user waits the full distance.
These figures are propagation delays only; they do not include queueing, TLS handshakes, or server processing time.
02

CDN Architecture: Origin, Backbone, and Edge

A CDN is a three-layer system. Your origin server holds the truth. The CDN backbone is a private high-speed fiber network connecting data centers around the world. Edge nodes — also called PoPs (Points of Presence) — are the CDN's outposts in hundreds of cities. Users connect to the nearest PoP, not to your origin.

Origin Server

  • Your actual application
  • Holds the canonical data
  • Ideally, only CDN edge nodes talk to it
  • Can be a single server or a cluster
  • Protected from direct user traffic

CDN Backbone

  • Private fiber between CDN data centers
  • Bypasses congestion on the public internet
  • Often lower latency than paths across the public, BGP-routed internet
  • Handles inter-PoP traffic and origin fetches
  • Typically owned by the CDN provider

Edge PoP

  • CDN data center in a city
  • Caches content locally
  • Serves nearby users, typically in under 10ms
  • Cloudflare: 330+ PoPs globally
  • Akamai: 4,200+ PoPs globally

[Figure: CDN ARCHITECTURE, Origin + Edge PoPs Worldwide. A diagram showing the origin server (India), edge PoP nodes, and the CDN backbone connections between them.]
HOW A PoP IS STRUCTURED INSIDE

Each PoP is a small data center with: reverse proxy servers (Nginx/Varnish/custom) that handle TLS termination and cache lookups; solid-state disk storage for the local cache (typically terabytes); and BGP routers that advertise the CDN's anycast IP addresses. A large PoP might have dozens of servers behind a local load balancer.

03

Cache Miss vs Cache Hit: The Key Concept

This is the mechanism that makes CDNs work. The first request for a resource at a given PoP has to go all the way to origin: that's a cache miss. Subsequent requests for the same resource are served from the PoP's local cache until the cached copy expires or is purged: that's a cache hit. The latency difference is staggering.

FIRST REQUEST (CACHE MISS)

  • User in New York requests /hero.jpg
  • Request hits NYC PoP: MISS, not in cache
  • PoP fetches from origin (Mumbai): +120ms
  • PoP stores response in local cache
  • Response returned to user; headers include X-Cache: MISS
  • Total latency: ~150ms

SAME REQUEST AGAIN (CACHE HIT)

  • Same user (or a different user) requests /hero.jpg
  • NYC PoP: HIT, served directly from SSD
  • Response returned; headers include X-Cache: HIT
  • Total latency: ~8ms

THE 18x DIFFERENCE

Cache miss: ~150ms. Cache hit: ~8ms. That's an 18× speedup — and the math only gets better for users further from the origin. A Tokyo user hitting a Mumbai origin sees ~200ms; the same Tokyo user hitting a Tokyo PoP sees ~3ms. The CDN moves the disk read to 3ms away instead of 200ms away.

The fraction of requests served from cache is called the cache hit ratio. A good CDN config achieves 85–99% cache hit ratio. The remaining 1–15% are cache misses — new content, dynamically personalized pages, or cache expirations.
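
To see why the hit ratio matters so much, the average latency can be estimated as a weighted mix of hit and miss latency. The sketch below assumes the illustrative figures from this section (~8ms per hit, ~150ms per miss); the function name is made up for the example.

```python
def average_latency_ms(hit_ratio: float, hit_ms: float, miss_ms: float) -> float:
    """Expected latency when a fraction `hit_ratio` of requests is served from cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


# Latencies taken from the NYC example above: ~8ms hit, ~150ms miss.
for ratio in (0.85, 0.95, 0.99):
    average = average_latency_ms(ratio, hit_ms=8, miss_ms=150)
    print(f"{ratio:.0%} hit ratio -> {average:.1f} ms average")
```

With these numbers, moving from an 85% to a 99% hit ratio cuts the average from roughly 29ms to roughly 9ms, because the remaining misses dominate the total.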

04

Cache-Control Headers: What Gets Cached and For How Long

CDNs respect HTTP cache headers that your origin server sends back. The most important is Cache-Control. This single header tells browsers, proxies, and CDN edges exactly how long to cache a response, under what conditions, and whether it can be shared across users.

RESPONSE HEADER ANATOMY
Cache-Control: public, max-age=86400, s-maxage=604800, stale-while-revalidate=3600
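public
Response can be stored by any cache: browser, CDN edge, or proxy. Shared caches can often store a response that carries explicit freshness (max-age or s-maxage) even without it; public matters most for responses that would otherwise not be shared-cacheable, such as replies to requests carrying an Authorization header.
Opposite: private (browser only, no CDN)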
max-age=86400
Browser caches for 86,400 seconds (1 day). After this, browser re-validates with the CDN or origin.
86400 = 1 day · 604800 = 1 week · 31536000 = 1 year
s-maxage=604800
Overrides max-age for shared caches (CDN edges, proxies). The CDN keeps the response for 7 days even though the browser TTL is 1 day.
"s-" = shared. Only shared caches honor this; browsers ignore it.
stale-while-revalidate=3600
Serve stale content for up to 1 hour while fetching a fresh copy from origin in the background. Zero added latency on revalidation.
Key for avoiding cache stampede on popular content.
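
As a rough illustration of how a cache might read this header, the sketch below parses the directives and picks a TTL for a browser versus a shared cache. It is deliberately simplified: it ignores no-cache, must-revalidate, Expires, and heuristic freshness, and the function names are hypothetical rather than taken from any CDN's code.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: value or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def ttl_seconds(directives: Dict[str, Optional[str]], shared_cache: bool) -> int:
    """Seconds a response may be served without revalidation (simplified rules)."""
    if "no-store" in directives:
        return 0
    if shared_cache and "private" in directives:
        return 0
    # s-maxage takes precedence over max-age, but only for shared caches.
    if shared_cache and directives.get("s-maxage") is not None:
        return int(directives["s-maxage"])
    if directives.get("max-age") is not None:
        return int(directives["max-age"])
    return 0


header = "public, max-age=86400, s-maxage=604800, stale-while-revalidate=3600"
parsed = parse_cache_control(header)
print(ttl_seconds(parsed, shared_cache=False))  # 86400
print(ttl_seconds(parsed, shared_cache=True))   # 604800
```

The same header therefore yields two lifetimes: one day in the browser and one week at the edge.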

ETags and Conditional Requests

When a cached response expires, the CDN doesn't always need to re-download the full file. If the origin sent an ETag header (an opaque validator, often a hash of the file content), the CDN can send a conditional request:

CONDITIONAL REQUEST FLOW
# CDN asks: "Is this still fresh?"
GET /logo.png HTTP/1.1
If-None-Match: "abc123def456"

# Origin says: "Yep, still the same" (tiny response, no body)
HTTP/1.1 304 Not Modified
ETag: "abc123def456"

# CDN extends its cache TTL. No bandwidth wasted re-downloading the file.
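
The origin side of this exchange might look roughly like the sketch below. It assumes the ETag is derived from a SHA-256 hash of the body and compares a single validator exactly; real servers also handle lists of validators, the `*` wildcard, and weak ETags. The function names are illustrative.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Build a quoted ETag from a content hash (one common choice, not the only one)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def origin_respond(
    body: bytes, if_none_match: Optional[str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    if if_none_match == etag:
        # Content unchanged: tiny 304 response with no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


logo = b"fake png bytes"
status, headers, payload = origin_respond(logo, if_none_match=None)
print(status, len(payload))  # first fetch: full body

status, headers, payload = origin_respond(logo, if_none_match=headers["ETag"])
print(status, len(payload))  # revalidation: 304 and an empty body
```

If the file changes, the hash changes, the validators no longer match, and the origin falls back to a full 200 response.
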
What CDNs Cache

  • Static assets with public directive
  • Images, JS, CSS, fonts, videos
  • HTML pages with explicit Cache-Control
  • API responses marked as public
  • Any response with s-maxage > 0

What CDNs Skip

  • Responses with private directive
  • Responses with no-store
  • Set-Cookie responses (usually)
  • Responses to requests with Authorization headers (unless explicitly marked cacheable, e.g. with public or s-maxage)
  • POST/PUT/DELETE requests (unsafe, state-changing methods)

THE VARY HEADER TRAP

The Vary header tells CDNs to maintain separate cache entries for different request variants. Vary: Accept-Encoding is fine — CDN caches a gzip version and a br version. But Vary: User-Agent is a cache-buster: CDN creates a separate cache entry for every unique User-Agent string — there are thousands of them. This effectively destroys your cache hit ratio. Never vary on User-Agent.
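
One way to picture the effect is to treat the cache key as the URL plus the values of every request header named in Vary. The sketch below is a simplified model (real CDNs often normalize headers such as Accept-Encoding first); the function name and the sample User-Agent strings are invented for the example.

```python
from typing import Dict, Tuple


def cache_key(url: str, vary: str, request_headers: Dict[str, str]) -> Tuple:
    """Cache key = URL plus the request's value for each header listed in Vary.

    `request_headers` is assumed to use lowercase header names.
    """
    names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    varied = tuple((name, request_headers.get(name, "")) for name in names)
    return (url, varied)


requests = [
    {"accept-encoding": "gzip", "user-agent": "BrowserA/1.0"},
    {"accept-encoding": "br", "user-agent": "BrowserB/2.0"},
    {"accept-encoding": "gzip", "user-agent": "BrowserC/3.0"},
]


def distinct_entries(vary: str) -> int:
    """How many separate cache entries the sample requests would create."""
    return len({cache_key("/app.js", vary, r) for r in requests})


print(distinct_entries("Accept-Encoding"))  # 2
print(distinct_entries("User-Agent"))       # 3
```

With Accept-Encoding the number of entries is bounded by the handful of encodings in use; with User-Agent it grows with every new browser string.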

05

Anycast Routing: How You Land on the Nearest PoP

Here's the magic trick: a CDN announces the exact same IP address from every PoP worldwide, simultaneously, using BGP (Border Gateway Protocol). When your device does a DNS lookup for assets.example.com, it gets back, say, 203.0.113.1. But BGP routes your actual connection to the PoP advertising that IP that is nearest in network terms, which is usually (though not always) the geographically closest one. No client-side logic needed.

HOW BGP ANYCAST WORKS

Traditional Unicast (one server, one IP)

  • IP 203.0.113.1 lives on one machine
  • All users worldwide connect to that machine
  • Routing is shortest path to that one destination

Anycast (many servers, same IP)

  • IP 203.0.113.1 is announced from every PoP
  • BGP routers globally route each packet to the nearest announcing PoP
  • User in London ends up at London PoP automatically

The Same IP, Four Locations

  • New York PoP (203.0.113.1): announced to North American routers via AS12345; US/Canada users land here
  • Frankfurt PoP (203.0.113.1): announced to European routers via AS12345; EU users land here
  • Singapore PoP (203.0.113.1): announced to SEA routers via AS12345; SEA/India users land here
  • Tokyo PoP (203.0.113.1): announced to APAC routers via AS12345; Japan/Korea users land here

WHY NOT USE DNS-BASED ROUTING?

Many CDNs have used GeoDNS, which returns a different IP per DNS lookup based on the resolver's location, and some still do. The problem: DNS answers are cached for their TTL, so you can't respond to network conditions in real time. BGP anycast is network-layer routing: routers re-evaluate paths as reachability and routing policy change (BGP itself does not measure latency). If a PoP goes down, its route is withdrawn and BGP reroutes traffic to the next nearest PoP, typically within seconds, without any DNS TTL waiting period.
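
The failover behavior can be modeled with a toy route table. The sketch below assumes a router prefers the announcement with the shortest AS path, which is only one step of real BGP best-path selection (local preference, MED, and other attributes are ignored); the path lengths are invented for illustration.

```python
from typing import Dict


def best_pop(announcements: Dict[str, int]) -> str:
    """Pick the PoP whose route has the shortest AS path (ties broken by name)."""
    if not announcements:
        raise LookupError("no PoP is announcing this prefix")
    return min(announcements, key=lambda pop: (announcements[pop], pop))


# Hypothetical AS-path lengths to 203.0.113.1 as seen from a London ISP's router.
london_view = {"Frankfurt": 2, "New York": 4, "Singapore": 5, "Tokyo": 6}
print(best_pop(london_view))  # Frankfurt

# Frankfurt goes down and withdraws its announcement; the same IP still resolves,
# but packets now follow the next-best route.
del london_view["Frankfurt"]
print(best_pop(london_view))  # New York
```

Nothing changes on the client: the IP address stays the same, and only the routers' view of where it lives is updated.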

06

Cache Invalidation: The Hard Problem

THE QUOTE

"There are only two hard things in Computer Science: cache invalidation and naming things."

— Phil Karlton (made famous by Martin Fowler)

You've cached /app.js across 300 PoPs worldwide with a 7-day TTL. Then you push a bug fix. How do you update 300 servers instantly? That's cache invalidation — and there are three strategies in use today.

URL VERSIONING

Never change the content behind a URL. Instead, embed the version (or a content hash) in the filename or a query parameter, so every new build gets a new URL.

# Old version (cached forever)
/static/app.v3.4.1.js
# New deploy — new URL
/static/app.v3.4.2.js
  • Old URL stays cached forever — no problem
  • New URL is a brand new cache entry (miss once)
  • Works automatically with build tools (Vite, Webpack)

Best for: Static assets (JS, CSS, images)
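
Build tools usually derive the versioned name from a hash of the file's content rather than a hand-maintained version number. A minimal sketch of that idea follows; the function name and the 8-character digest length are arbitrary choices for the example, not what Vite or Webpack do internally.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the file extension."""
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old_build = versioned_name("/static/app.js", b"console.log('v1');")
new_build = versioned_name("/static/app.js", b"console.log('v2');")

print(old_build)  # /static/app.<hash of v1>.js
print(new_build)  # /static/app.<hash of v2>.js
```

Because the name changes only when the bytes change, unchanged files keep their cached URLs across deploys.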

PURGE API

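Most major CDNs expose an API to invalidate specific URLs across all PoPs.

# Cloudflare purge (files are listed as full URLs)
POST /client/v4/zones/{zone_id}/purge_cache
{ "files": ["https://www.example.com/blog/post-123"] }
  • Near-immediate invalidation (~150ms propagation on the fastest CDNs, seconds on others)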
  • Use when content changes without URL change
  • Deploy hooks: trigger purge on CMS publish
  • Rate limits apply (don't purge millions of URLs)

Best for: CMS pages, news articles

SURROGATE KEYS

Tag cached responses with arbitrary keys. Purge all assets with a given tag in one API call.

# Origin sends tag in header
Surrogate-Key: product-42 category-shoes

# Purge all tagged "product-42" (generic form; the exact endpoint varies by CDN)
POST /purge { "tag": "product-42" }
  • One purge call invalidates thousands of URLs
  • Supported by Fastly, Cloudflare (Cache-Tag)
  • Product update? Purge all pages containing it

Best for: E-commerce, data relationships
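
Conceptually, the edge keeps an index from each tag to the URLs that carry it. The sketch below is an in-memory toy model of that index, assuming space-separated tags as in the Surrogate-Key example above; the class and method names are hypothetical and do not mirror any CDN's implementation.

```python
from collections import defaultdict
from typing import Dict, Optional, Set


class TaggedCache:
    """Toy cache that can purge every response carrying a given surrogate key."""

    def __init__(self) -> None:
        self._responses: Dict[str, bytes] = {}
        self._urls_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def store(self, url: str, body: bytes, surrogate_key_header: str) -> None:
        self._responses[url] = body
        for tag in surrogate_key_header.split():
            self._urls_by_tag[tag].add(url)

    def get(self, url: str) -> Optional[bytes]:
        return self._responses.get(url)

    def purge_tag(self, tag: str) -> int:
        """Drop every cached response tagged with `tag`; return how many were removed."""
        removed = 0
        for url in self._urls_by_tag.pop(tag, set()):
            if self._responses.pop(url, None) is not None:
                removed += 1
        return removed


cache = TaggedCache()
cache.store("/products/42", b"<html>shoe</html>", "product-42 category-shoes")
cache.store("/category/shoes", b"<html>list</html>", "category-shoes")
cache.store("/products/7", b"<html>boot</html>", "product-7 category-shoes")

print(cache.purge_tag("product-42"))  # 1
print(cache.get("/category/shoes") is not None)  # True
```

Tagging the category page with `product-42` as well would let a single purge refresh both the product page and every listing that shows it.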

STALE-WHILE-REVALIDATE: THE HYBRID TRICK

stale-while-revalidate=N is the best of both worlds. When a cached response expires, instead of blocking the user while it fetches fresh content, the CDN serves the stale version immediately and kicks off a background refresh. Users see zero added latency. The trade-off: for up to N seconds after the TTL expires, users might see content that is one version old. For most content (blog posts, product pages), this is an acceptable trade-off for the latency win.
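
The decision a cache makes on each request can be written as a three-way check on the age of the stored response. This is a minimal sketch of that logic under the header values used earlier (s-maxage=604800, stale-while-revalidate=3600); the function name and return labels are illustrative.

```python
def cache_decision(age_s: int, max_age_s: int, swr_s: int) -> str:
    """Classify a cached response by its age in seconds."""
    if age_s < max_age_s:
        return "serve_fresh"
    if age_s < max_age_s + swr_s:
        # Serve the stale copy now and refresh it in the background.
        return "serve_stale_and_revalidate"
    # Too old even for the stale window: the user waits for the origin.
    return "fetch_from_origin"


ONE_WEEK = 604_800
ONE_HOUR = 3_600

print(cache_decision(age_s=100, max_age_s=ONE_WEEK, swr_s=ONE_HOUR))
print(cache_decision(age_s=ONE_WEEK + 60, max_age_s=ONE_WEEK, swr_s=ONE_HOUR))
print(cache_decision(age_s=ONE_WEEK + 2 * ONE_HOUR, max_age_s=ONE_WEEK, swr_s=ONE_HOUR))
```

Only the last case adds origin latency to a user's request, which is why popular content rarely blocks on revalidation.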

07

Edge Compute: Beyond Caching

Modern CDNs can run your code — JavaScript or WebAssembly — at the edge node, before the request ever reaches your origin. This is a paradigm shift: instead of caching pre-built responses, you run logic at the closest server to the user.

Cloudflare Workers
  • V8 isolates, not containers
  • Cold start: ~0ms (no JVM/Node boot)
  • JavaScript/TypeScript/Wasm
  • Runs at 330+ PoPs globally
  • CPU time per request is capped by plan (10ms on the free tier, higher on paid plans)
AWS Lambda@Edge
  • Runs at CloudFront regional edge caches
  • Node.js or Python
  • Cold start: 100–500ms (container-based)
  • Can modify request/response headers
  • Deeper AWS integration
Fastly Compute
  • WebAssembly-first (Rust, Go, JS)
  • Sub-millisecond cold starts
  • Each request runs in its own Wasm sandbox
  • Instant global deploys
  • Granular traffic control

Edge Compute Use Cases

A/B TESTING AT EDGE

Instead of sending users to origin to check which variant they're in, the edge Worker reads the cookie, assigns a variant, and serves the correct cached HTML. No round-trip to origin is needed, so A/B testing adds no origin latency.
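
The assignment step is typically a deterministic hash, so the same user always lands in the same variant without any shared state. The sketch below shows that logic in Python for consistency with the other examples (a real Worker would be JavaScript, TypeScript, or Wasm); the cookie name `ab_<experiment>` and the function names are assumptions made for the example.

```python
import hashlib
from typing import Dict, Sequence


def assign_variant(user_id: str, experiment: str, variants: Sequence[str] = ("A", "B")) -> str:
    """Deterministically map a user to a variant by hashing experiment and user ID."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]


def choose_variant(
    cookies: Dict[str, str],
    user_id: str,
    experiment: str,
    variants: Sequence[str] = ("A", "B"),
) -> str:
    """Reuse the variant stored in the cookie if valid; otherwise assign one."""
    existing = cookies.get(f"ab_{experiment}")
    if existing in variants:
        return existing
    return assign_variant(user_id, experiment, variants)


# A returning visitor keeps the variant recorded in their cookie.
print(choose_variant({"ab_homepage": "B"}, "user-123", "homepage"))  # B

# A new visitor gets a stable assignment derived from their ID.
first = choose_variant({}, "user-123", "homepage")
second = choose_variant({}, "user-123", "homepage")
print(first == second)  # True
```

The edge can then use the variant as part of the cache key, so each variant's HTML is cached separately.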

AUTH CHECK AT EDGE

Validate JWTs at the PoP before the request reaches origin. Protected routes bounce unauthenticated requests in ~5ms, before they consume origin resources. Requests rejected at the edge never touch your servers.

GEO REDIRECTS

Read the user's country from CDN request metadata and redirect to the correct regional domain — at the PoP. No origin round-trip needed for /eu/ vs /us/ routing.

IMAGE TRANSFORMS

Resize, compress, convert to WebP/AVIF at the edge based on the Accept header and device width. No pre-generated image variants needed on origin — the edge handles it on first request and caches the result.

The Latency Stack: Edge vs Origin Auth

Auth at Origin (Slow)

  • User sends request: 0ms
  • CDN PoP (passes through): +5ms
  • Origin receives request: +120ms
  • JWT validated on origin: +5ms
  • Response returns to user: +120ms
  • Total: ~250ms

Auth at Edge (Fast)

  • User sends request: 0ms
  • PoP Worker validates JWT: +5ms
  • Authenticated request to origin: +120ms
  • No re-auth needed at origin: +0ms
  • Response returns to user: +120ms
  • Total: ~245ms (invalid requests are rejected in ~5ms)

The latency saving for valid requests is small (5ms). The win is when requests are rejected: invalid auth is blocked in 5ms at the edge instead of sending a full 250ms round-trip to origin only to be rejected there. For high-traffic APIs under load, this means your origin only sees authenticated traffic.

CDN Provider Comparison

| Provider | PoPs | Best For | Purge Speed | Edge Compute |
|---|---|---|---|---|
| Cloudflare | 330+ | General purpose, DDoS protection, Workers at edge, free tier | ~150ms global | Workers (V8 isolates, ~0ms cold start) |
| AWS CloudFront | 600+ (incl. regional edge) | AWS-native apps, S3/EC2 origins, deep IAM integration | 1–10 seconds | Lambda@Edge (Node.js/Python, 100ms+ cold start) |
| Fastly | 88+ | Real-time purge, streaming media, Surrogate-Key invalidation | ~150ms | Compute, formerly Compute@Edge (Wasm, sub-ms cold start) |
| Akamai | 4,200+ | Enterprise, media streaming, highest PoP density worldwide | ~5 seconds | EdgeWorkers (JavaScript) |