How CDN Works — DevDunia

The Latency Problem: Physics Is Not Optional

Before CDNs existed, a single server somewhere in a data center served all users worldwide. That sounds fine — until you realize that the speed of light in fiber optic cable is roughly 200,000 km/s, which is about two-thirds the speed of light in a vacuum. And that limit is not a software problem you can optimize away. It is physics.

THE PHYSICS CONSTRAINT

Light in fiber: ~200,000 km/s

Mumbai to New York = ~12,000 km of cable
One-way trip = ~60ms of pure propagation
TCP requires a round-trip to establish: +120ms before the first byte
TLS handshake adds 1–2 more round-trips: +120–240ms (240–360ms cumulative)

One page load = many round trips

HTML fetch: 1 RTT
CSS files: 2–5 RTTs
JS bundles: 5–15 RTTs
Images, fonts, API calls: 10–30 RTTs
Total: 10–50 effective RTTs (some fetches overlap in parallel) × 120ms = 1.2–6 seconds from physics alone

The painful truth: a user in New York loading a page from a Mumbai server will wait over a second purely because of the speed of light — no amount of code optimization fixes this. The only solution is to move the data closer to the user. That's a CDN.
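The arithmetic above can be reproduced in a few lines. This is a minimal sketch that assumes the article's round figures (200,000 km/s in fiber, 12,000 km of cable) and counts propagation delay only; the function names are illustrative.

```python
FIBER_SPEED_KM_PER_S = 200_000  # approximate speed of light in fiber, per the text


def one_way_ms(distance_km: float) -> float:
    """Pure propagation delay over fiber, in milliseconds."""
    return distance_km * 1000 / FIBER_SPEED_KM_PER_S


def rtt_ms(distance_km: float) -> float:
    """Round-trip time: there and back again."""
    return 2 * one_way_ms(distance_km)


def page_load_ms(distance_km: float, round_trips: int) -> float:
    """Propagation-only cost when the round trips happen one after another."""
    return round_trips * rtt_ms(distance_km)


mumbai_to_new_york_km = 12_000  # cable distance assumed in the text
print("one way:", one_way_ms(mumbai_to_new_york_km), "ms")
print("round trip:", rtt_ms(mumbai_to_new_york_km), "ms")
print("10 RTTs:", page_load_ms(mumbai_to_new_york_km, 10), "ms")
print("50 RTTs:", page_load_ms(mumbai_to_new_york_km, 50), "ms")
```

Shrinking distance_km is the only lever in this model, which is exactly what a nearby PoP does.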

[Figure: Single origin server, the painful round-trip reality. Values shown are propagation delays only; they do not include queueing, TLS handshakes, or server processing time. Takeaway: physics wins, so move the data.]

CDN Architecture: Origin, Backbone, and Edge

A CDN is a three-layer system. Your origin server holds the truth. The CDN backbone is a private high-speed fiber network connecting data centers around the world. Edge nodes — also called PoPs (Points of Presence) — are the CDN's outposts in hundreds of cities. Users connect to the nearest PoP, not to your origin.

Origin Server

Your actual application
Holds the canonical data
Only CDN edge nodes talk to it (ideally)
Can be a single server or a cluster
Protected from direct user traffic

CDN Backbone

Private fiber between CDN data centers
Bypasses the public internet's congestion
Lower latency than BGP-routed internet
Handles inter-PoP traffic and origin fetches
Typically owned by the CDN provider

Edge PoP

CDN data center in a city
Caches content locally
Serves users with sub-10ms latency
Cloudflare: 300+ PoPs globally
Akamai: 4,200+ PoPs globally

[Figure: CDN architecture, showing the origin server (India), edge PoP nodes worldwide, and the CDN backbone connections between them.]

HOW A PoP IS STRUCTURED INSIDE

Each PoP is a small data center with: reverse proxy servers (Nginx/Varnish/custom) that handle TLS termination and cache lookups; solid-state disk storage for the local cache (typically terabytes); and BGP routers that advertise the CDN's anycast IP addresses. A large PoP might have dozens of servers behind a local load balancer.

Cache Miss vs Cache Hit: The Key Concept

This is the mechanism that makes CDNs work. The first request for a resource at a given PoP has to go all the way to origin: that's a cache miss. Subsequent requests for the same resource at that PoP are served from its local cache until the entry expires or is evicted: that's a cache hit. The latency difference is staggering.

FIRST REQUEST (CACHE MISS)

User in New York requests /hero.jpg

Hits NYC PoP — MISS — not in cache

PoP fetches from origin (Mumbai): +120ms

PoP stores response in local cache

Response returned to user — headers include X-Cache: MISS

~150ms total latency

SAME REQUEST AGAIN (CACHE HIT)

Same user (or different user) requests /hero.jpg

NYC PoP — HIT — served directly from SSD

Response returned — headers include X-Cache: HIT

~8ms total latency
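The miss-then-hit flow can be sketched as a tiny in-memory cache sitting in front of an origin fetch. This is a toy model, not how any particular CDN implements its PoPs: the EdgeCache class and the fetch_from_origin callback are hypothetical names, and expiry and eviction are left out.

```python
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy PoP cache: an in-memory dict in front of a slow origin fetch."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch_from_origin = fetch_from_origin
        self._store: Dict[str, bytes] = {}

    def get(self, path: str) -> Tuple[bytes, Dict[str, str]]:
        """Return (body, response headers) and report HIT or MISS."""
        if path in self._store:
            # Cache hit: served locally, the origin is never contacted.
            return self._store[path], {"X-Cache": "HIT"}
        # Cache miss: pay the long round trip once, then remember the result.
        body = self._fetch_from_origin(path)
        self._store[path] = body
        return body, {"X-Cache": "MISS"}
```

Calling get("/hero.jpg") twice on the same instance reports MISS and then HIT, with the origin contacted only once.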

THE 18x DIFFERENCE

Cache miss: ~150ms. Cache hit: ~8ms. That's an 18× speedup — and the math only gets better for users further from the origin. A Tokyo user hitting a Mumbai origin sees ~200ms; the same Tokyo user hitting a Tokyo PoP sees ~3ms. The CDN moves the disk read to 3ms away instead of 200ms away.

The fraction of requests served from cache is called the cache hit ratio. A good CDN config achieves 85–99% cache hit ratio. The remaining 1–15% are cache misses — new content, dynamically personalized pages, or cache expirations.
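To see why the hit ratio matters so much, the average latency can be modeled as a weighted mix of hits and misses. This sketch assumes the article's illustrative figures (8ms for a hit, 150ms for a miss) as defaults; expected_latency_ms is a hypothetical helper name.

```python
def expected_latency_ms(hit_ratio: float,
                        hit_ms: float = 8.0,
                        miss_ms: float = 150.0) -> float:
    """Average latency for a given cache hit ratio (0.0 to 1.0)."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


for ratio in (0.50, 0.85, 0.95, 0.99):
    print(f"hit ratio {ratio:.0%}: {expected_latency_ms(ratio):.1f} ms on average")
```

Because misses are so much slower than hits, the last few percentage points of hit ratio still move the average noticeably.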

Cache-Control Headers: What Gets Cached and For How Long

CDNs respect HTTP cache headers that your origin server sends back. The most important is Cache-Control. This single header tells browsers, proxies, and CDN edges exactly how long to cache a response, under what conditions, and whether it can be shared across users.

RESPONSE HEADER ANATOMY

Cache-Control: public, max-age=86400, s-maxage=604800, stale-while-revalidate=3600

public

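Response can be stored by any cache: browser, CDN edge, proxy. Many CDNs will cache a response that carries max-age or s-maxage even without it, but stating it explicitly removes ambiguity, and it is required for responses that shared caches would otherwise skip, such as those to requests with an Authorization header.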

Opposite: private (browser only, no CDN)

max-age=86400

Browser caches for 86,400 seconds (1 day). After this, browser re-validates with the CDN or origin.

86400 = 1 day · 604800 = 1 week · 31536000 = 1 year

s-maxage=604800

Overrides max-age for shared caches (CDN edges). CDN keeps it for 7 days even if browser TTL is 1 day.

"s-" = shared. Only shared caches (CDN nodes, proxies) honor this; browsers ignore it.

stale-while-revalidate=3600

Serve stale content for up to 1 hour while fetching a fresh copy from origin in the background. Zero added latency on revalidation.

Key for avoiding cache stampede on popular content.
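The directives above can be read mechanically. The following is a simplified sketch of how a cache might derive a TTL from the header; it deliberately ignores no-cache, Expires, and heuristic freshness, and the function names are illustrative rather than taken from any real CDN.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: value or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


def freshness_lifetime(directives: Dict[str, Optional[str]], shared_cache: bool) -> int:
    """Seconds the response may be reused; 0 means do not reuse it."""
    if "no-store" in directives:
        return 0
    if shared_cache and "private" in directives:
        return 0
    if shared_cache and directives.get("s-maxage") is not None:
        return int(directives["s-maxage"])  # shared caches prefer s-maxage
    if directives.get("max-age") is not None:
        return int(directives["max-age"])
    return 0


header = "public, max-age=86400, s-maxage=604800, stale-while-revalidate=3600"
parsed = parse_cache_control(header)
print("browser TTL:", freshness_lifetime(parsed, shared_cache=False))
print("CDN TTL:", freshness_lifetime(parsed, shared_cache=True))
```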

ETags and Conditional Requests

When a cached response expires, the CDN doesn't always need to re-download the full file. If the origin sent an ETag header (an opaque version identifier, often a hash of the file content), the CDN can send a conditional request:

CONDITIONAL REQUEST FLOW

      # CDN asks: "Is this still fresh?"
      GET /logo.png HTTP/1.1
      If-None-Match: "abc123def456"

      # Origin says: "Yep, still the same" (tiny response, no body)
      HTTP/1.1 304 Not Modified
      ETag: "abc123def456"

      # CDN extends its cache TTL. No bandwidth wasted re-downloading the file.
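From the origin's side, the same exchange might look like the sketch below. It assumes the ETag is a truncated SHA-256 of the body, which is one common choice and not a requirement, and it handles only a single ETag in If-None-Match (real requests may carry a list or "*"). The function names are hypothetical.

```python
import hashlib
from typing import Dict, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the content (an assumed scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers: Dict[str, str], body: bytes) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # The cached copy is still current: send headers only, no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body
```

A cache that stored the ETag from the first 200 response can replay it in If-None-Match and receive an empty 304 for as long as the file is unchanged.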

What CDNs Cache

Static assets with public directive
Images, JS, CSS, fonts, videos
HTML pages with explicit Cache-Control
API responses marked as public
Any response with s-maxage > 0

What CDNs Skip

Responses with private directive
Responses with no-store
Set-Cookie responses (usually)
Requests with Authorization headers
POST/PUT/DELETE requests (unsafe, state-changing methods)

THE VARY HEADER TRAP

The Vary header tells CDNs to maintain separate cache entries for different request variants. Vary: Accept-Encoding is fine — CDN caches a gzip version and a br version. But Vary: User-Agent is a cache-buster: CDN creates a separate cache entry for every unique User-Agent string — there are thousands of them. This effectively destroys your cache hit ratio. Never vary on User-Agent.
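One way to picture the trap is to treat the cache key as the URL plus the values of every header named in Vary. This is a simplified model (real CDNs often normalize Accept-Encoding before keying), and cache_key is a hypothetical helper.

```python
from typing import Dict, Tuple


def cache_key(url: str, request_headers: Dict[str, str], vary: str) -> Tuple:
    """Build a cache key from the URL and the request headers listed in Vary."""
    names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (url,) + tuple((name, lowered.get(name, "")) for name in names)


requests_seen = [
    {"User-Agent": "browser-a/1.0", "Accept-Encoding": "gzip"},
    {"User-Agent": "browser-b/2.0", "Accept-Encoding": "br"},
    {"User-Agent": "browser-c/3.0", "Accept-Encoding": "gzip"},
]
by_encoding = {cache_key("/app.js", r, "Accept-Encoding") for r in requests_seen}
by_user_agent = {cache_key("/app.js", r, "User-Agent") for r in requests_seen}
print(len(by_encoding), "entries when varying on Accept-Encoding")
print(len(by_user_agent), "entries when varying on User-Agent")
```

With Accept-Encoding the number of entries is bounded by the handful of encodings in use; with User-Agent it grows with every distinct client string.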

Anycast Routing: How You Land on the Nearest PoP

Here's the magic trick: a CDN announces the exact same IP address from every PoP worldwide, simultaneously, using BGP (Border Gateway Protocol). When your device does a DNS lookup for assets.example.com, it gets back, say, 203.0.113.1. BGP then routes your actual connection to the PoP advertising that IP that is nearest in routing terms, which is usually (though not always) the geographically closest one. No client-side logic needed.

HOW BGP ANYCAST WORKS

Traditional Unicast (one server, one IP)

IP 203.0.113.1 lives on one machine
All users worldwide connect to that machine
Routing is shortest path to that one destination

Anycast (many servers, same IP)

IP 203.0.113.1 is announced from every PoP
BGP routers globally route each packet to the nearest announcing PoP
User in London ends up at London PoP automatically

The Same IP, Four Locations

203.0.113.1

New York PoP

Announced to North American routers via AS12345

US/Canada users land here

203.0.113.1

Frankfurt PoP

Announced to European routers via AS12345

EU users land here

203.0.113.1

Singapore PoP

Announced to SEA routers via AS12345

SEA/India users land here

203.0.113.1

Tokyo PoP

Announced to APAC routers via AS12345

Japan/Korea users land here

WHY NOT USE DNS-BASED ROUTING?

Older CDNs used GeoDNS, returning a different IP per DNS lookup based on the resolver's location. Problem: DNS has TTLs, so you can't respond to network conditions in real time. BGP anycast is network-layer routing: routers choose paths based on reachability and routing policy (such as AS-path length), not on measured latency. If a PoP goes down and its announcement is withdrawn, BGP reroutes traffic to the next nearest PoP, typically within seconds to a few minutes, without any DNS TTL waiting period.
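The failover behavior can be illustrated with a toy model in which each PoP announcing the shared IP has a routing cost from the user's network, and traffic goes to the cheapest one still announcing. Real BGP path selection involves many more attributes and policies; the costs and the pick_pop helper below are made up for illustration.

```python
from typing import Dict, Optional, Set


def pick_pop(path_cost: Dict[str, int], announcing: Set[str]) -> Optional[str]:
    """Choose the announcing PoP with the lowest cost (ties broken by name)."""
    candidates = [(cost, pop) for pop, cost in path_cost.items() if pop in announcing]
    return min(candidates)[1] if candidates else None


# Hypothetical AS-path lengths as seen from a London ISP.
costs_from_london = {"Frankfurt": 2, "New York": 4, "Singapore": 6, "Tokyo": 7}
announcing = set(costs_from_london)

print("normal:", pick_pop(costs_from_london, announcing))
announcing.discard("Frankfurt")  # the PoP fails and withdraws its route
print("after failure:", pick_pop(costs_from_london, announcing))
```

The client keeps using the same IP address throughout; only the route behind it changes.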

Cache Invalidation: The Hard Problem

THE QUOTE

"There are only two hard things in Computer Science: cache invalidation and naming things."

— Phil Karlton (made famous by Martin Fowler)

You've cached /app.js across 300 PoPs worldwide with a 7-day TTL. Then you push a bug fix. How do you update 300 servers instantly? That's cache invalidation — and there are three strategies in use today.

URL VERSIONING

Never change the contents served at a given URL. Instead, embed the version (or a content hash) in the filename or in a query parameter, so every deploy produces a new URL.

          # Old version (cached forever)
          /static/app.v3.4.1.js

          # New deploy, new URL
          /static/app.v3.4.2.js

Old URL stays cached forever — no problem
New URL is a brand new cache entry (miss once)
Works automatically with build tools (Vite, Webpack)

Best for: Static assets (JS, CSS, images)
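Build tools usually derive the version from the file contents rather than a hand-maintained number. The sketch below swaps the semantic version in the example for a short content hash; the 8-character SHA-256 prefix and the versioned_path name are assumptions for illustration, not the scheme of any particular bundler.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_path(path: str, content: bytes) -> str:
    """Insert a short content hash before the file extension."""
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:8]
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old_url = versioned_path("/static/app.js", b"console.log('v1');")
new_url = versioned_path("/static/app.js", b"console.log('v2');")
print(old_url)
print(new_url)
```

Unchanged files keep their URL (and their cache entries) across deploys, while any edit yields a new URL automatically.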

PURGE API

Most CDNs expose an API to invalidate specific URLs across all PoPs on demand.

          # Cloudflare purge (files are listed as full URLs)
          POST /client/v4/zones/{id}/purge_cache
          { "files": ["https://example.com/blog/post-123"] }

Immediate invalidation (~150ms propagation)
Use when content changes without URL change
Deploy hooks: trigger purge on CMS publish
Rate limits apply (don't purge millions of URLs)

Best for: CMS pages, news articles
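A deploy hook could assemble the purge call along these lines. This sketch only constructs the request and does not send it; the endpoint shape follows Cloudflare's v4 API as commonly documented, but the zone ID, token, and URL are placeholders, so check the provider's current documentation before relying on it.

```python
import json
import urllib.request
from typing import Iterable

API_BASE = "https://api.cloudflare.com/client/v4"


def build_purge_request(zone_id: str, api_token: str,
                        urls: Iterable[str]) -> urllib.request.Request:
    """Build (but do not send) a purge-by-URL request."""
    payload = json.dumps({"files": list(urls)}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/zones/{zone_id}/purge_cache",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; sending would be urllib.request.urlopen(request).
request = build_purge_request("ZONE_ID", "API_TOKEN", ["https://example.com/blog/post-123"])
print(request.get_method(), request.full_url)
```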

SURROGATE KEYS

Tag cached responses with arbitrary keys. Purge all assets with a given tag in one API call.

          # Origin sends tag in header
          Surrogate-Key: product-42 category-shoes

          # Purge all tagged "product-42"
          POST /purge { "tag": "product-42" }

One purge call invalidates thousands of URLs
Supported by Fastly, Cloudflare (Cache-Tag)
Product update? Purge all pages containing it

Best for: E-commerce, data relationships
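Conceptually, tag-based purging needs only a reverse index from tag to URLs. The following is a minimal in-memory sketch of that idea, not how Fastly or Cloudflare implement it; the TaggedCache class is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set


class TaggedCache:
    """Toy cache that remembers which URLs carry which surrogate keys."""

    def __init__(self) -> None:
        self._entries: Dict[str, bytes] = {}
        self._urls_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def store(self, url: str, body: bytes, surrogate_key: str) -> None:
        """Cache a response and index it under each space-separated tag."""
        self._entries[url] = body
        for tag in surrogate_key.split():
            self._urls_by_tag[tag].add(url)

    def purge_tag(self, tag: str) -> int:
        """Drop every cached URL carrying this tag; return how many were removed."""
        purged = 0
        for url in self._urls_by_tag.pop(tag, set()):
            if url in self._entries:
                del self._entries[url]
                purged += 1
        return purged

    def __contains__(self, url: str) -> bool:
        return url in self._entries
```

Storing a product page, a category page, and a search page under "product-42" lets a single purge_tag("product-42") call remove all three while leaving unrelated pages cached.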

STALE-WHILE-REVALIDATE: THE HYBRID TRICK

stale-while-revalidate=N is the best of both worlds. When a cached response expires, instead of blocking the user while it fetches fresh content, the CDN serves the stale version immediately and kicks off a background refresh. Users see zero added latency. The trade-off: for up to N seconds after the TTL expires, users might see content that is one version old. For most content (blog posts, product pages), this is an acceptable trade-off for the latency win.
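The decision a cache makes for each request can be written as a three-way check on the entry's age. This is a simplified sketch assuming only a TTL and a stale-while-revalidate window; the return labels and the cache_decision name are illustrative.

```python
def cache_decision(age_s: int, ttl_s: int, swr_s: int) -> str:
    """Classify a cached entry by its age in seconds."""
    if age_s < ttl_s:
        return "fresh"  # serve from cache, nothing else to do
    if age_s < ttl_s + swr_s:
        return "stale-serve-and-revalidate"  # serve now, refresh in the background
    return "expired-fetch-from-origin"  # user waits for the origin


# Using s-maxage=604800 and stale-while-revalidate=3600 from the earlier header.
for age in (3_600, 604_800, 606_000, 700_000):
    print(age, cache_decision(age, ttl_s=604_800, swr_s=3_600))
```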

Edge Compute: Beyond Caching

Modern CDNs can run your code — JavaScript or WebAssembly — at the edge node, before the request ever reaches your origin. This is a paradigm shift: instead of caching pre-built responses, you run logic at the closest server to the user.

Cloudflare Workers

V8 isolates, not containers
Cold start: ~0ms (no JVM/Node boot)
JavaScript/TypeScript/Wasm
Runs at 300+ PoPs globally
CPU time capped per request (10ms on the free plan, higher on paid plans)

AWS Lambda@Edge

Runs at CloudFront PoPs
Node.js or Python
Cold start: 100–500ms (container-based)
Can modify request/response headers
Deeper AWS integration

Fastly Compute

WebAssembly-first (Rust, Go, JS)
Sub-millisecond cold starts
Each request runs in its own Wasm sandbox
Instant global deploys
Granular traffic control

Edge Compute Use Cases

A/B TESTING AT EDGE

Instead of sending users to origin to check which variant they're in, the edge Worker reads the cookie, assigns variant, and serves the correct cached HTML. Zero round-trips to origin. Zero latency added to A/B testing.
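The assignment logic is small enough to run on every request. Production edge code would typically be JavaScript or Wasm; the sketch below uses Python purely to show the logic, and the cookie name, experiment name, and hashing scheme are assumptions.

```python
import hashlib
from typing import Dict, Sequence


def assign_variant(user_id: str, experiment: str,
                   variants: Sequence[str] = ("A", "B")) -> str:
    """Deterministically bucket a user by hashing experiment and user ID."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    return variants[int.from_bytes(digest[:8], "big") % len(variants)]


def variant_for_request(cookies: Dict[str, str], user_id: str, experiment: str,
                        variants: Sequence[str] = ("A", "B")) -> str:
    """Reuse the variant stored in the cookie, otherwise assign one."""
    existing = cookies.get(f"ab_{experiment}")
    if existing in variants:
        return existing
    return assign_variant(user_id, experiment, variants)
```

Because the bucket is a pure function of the experiment and user ID, every PoP reaches the same answer without shared state, and the result can select which cached HTML variant to serve.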

AUTH CHECK AT EDGE

Validate JWTs at the PoP before the request reaches origin. Protected routes bounce unauthenticated requests in ~5ms, before they consume origin resources. Requests rejected at the edge never touch your servers.

GEO REDIRECTS

Read the user's country from CDN request metadata and redirect to the correct regional domain — at the PoP. No origin round-trip needed for /eu/ vs /us/ routing.

IMAGE TRANSFORMS

Resize, compress, convert to WebP/AVIF at the edge based on the Accept header and device width. No pre-generated image variants needed on origin — the edge handles it on first request and caches the result.
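The format choice itself is a small content-negotiation step. This sketch prefers AVIF, then WebP, then falls back to JPEG; that ordering is an assumption, q-values in the Accept header are ignored for brevity, and pick_image_format is a hypothetical helper.

```python
def pick_image_format(accept_header: str) -> str:
    """Choose an output image format from the request's Accept header."""
    offered = {part.split(";")[0].strip().lower() for part in accept_header.split(",")}
    for mime, fmt in (("image/avif", "avif"), ("image/webp", "webp")):
        if mime in offered:
            return fmt
    return "jpeg"  # safe fallback understood by effectively every client


print(pick_image_format("image/avif,image/webp,image/*,*/*;q=0.8"))
print(pick_image_format("image/webp,*/*"))
print(pick_image_format("*/*"))
```

The chosen format would then become part of the cache key so each variant is transformed once and served from cache afterwards.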

The Latency Stack: Edge vs Origin Auth

Auth at Origin (Slow)

User sends request 0ms

CDN PoP (passes through) +5ms

Origin receives request +120ms

JWT validated on origin +5ms

Response returns to user +120ms

~250ms total

Auth at Edge (Fast)

User sends request 0ms

PoP Worker validates JWT +5ms

Authenticated req to origin +120ms

No re-auth needed at origin +0ms

Response returns to user +120ms

~245ms total (invalid requests are rejected in ~5ms)

The latency saving for valid requests is small (5ms). The win is when requests are rejected: invalid auth is blocked in 5ms at the edge instead of sending a full 250ms round-trip to origin only to be rejected there. For high-traffic APIs under load, this means your origin only sees authenticated traffic.
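The check an edge function performs can be sketched with the standard library alone. This is a minimal illustration assuming HS256-signed tokens and a shared secret available at the edge; real deployments would normally use a vetted JWT library and often asymmetric keys. The sign_hs256 helper exists only to produce test tokens, and all names here are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(payload: dict, secret: bytes) -> str:
    """Create an HS256 token (used here only to build examples)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url_encode(json.dumps(payload).encode())
    sig = hmac.new(secret, f"{header}.{body}".encode("ascii"), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64url_encode(sig)}"


def verify_hs256(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the payload if the signature and exp claim check out, else None."""
    now = time.time() if now is None else now
    try:
        header_b64, body_b64, sig_b64 = token.split(".")
        if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
            return None
        signed = f"{header_b64}.{body_b64}".encode("ascii")
        expected = hmac.new(secret, signed, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        payload = json.loads(_b64url_decode(body_b64))
        exp = payload.get("exp")
        if exp is not None and now >= exp:
            return None  # token has expired
        return payload
    except (ValueError, TypeError, AttributeError):
        return None  # malformed token


def edge_status(token: str, secret: bytes, now: Optional[float] = None) -> int:
    """401 is answered at the PoP; 200 stands in for forwarding to origin."""
    return 200 if verify_hs256(token, secret, now) is not None else 401
```

Only requests for which edge_status returns 200 would continue on the long trip to origin; everything else is answered from the PoP.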

CDN Provider Comparison

Provider	PoPs	Best For	Purge Speed	Edge Compute
Cloudflare	330+	General purpose, DDoS protection, Workers at edge, free tier	~150ms global	Workers (V8 isolates, ~0ms cold start)
AWS CloudFront	600+ (incl. regional edge)	AWS-native apps, S3/EC2 origins, deep IAM integration	1–10 seconds	Lambda@Edge (Node.js/Python, 100ms+ cold start)
Fastly	88+	Real-time purge, streaming media, Surrogate-Key invalidation	~150ms global	Compute, formerly Compute@Edge (Wasm, sub-ms cold start)
Akamai	4,200+	Enterprise, media streaming, highest PoP density worldwide	~5 seconds	EdgeWorkers (JavaScript)
