Before CDNs existed, a single server somewhere in a data center served all users worldwide. That sounds fine — until you realize that the speed of light in fiber optic cable is roughly 200,000 km/s, which is about two-thirds the speed of light in a vacuum. And that limit is not a software problem you can optimize away. It is physics.
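The painful truth: a user in New York loading a page from a Mumbai server pays roughly 125 ms per round trip purely because of the speed of light (about 12,500 km each way at 200,000 km/s). A page load needs several round trips for the TCP handshake, the TLS negotiation, and the HTTP request itself, and real routes are longer than the straight-line path, so the total can approach a second. No amount of code optimization fixes this. The only solution is to move the data closer to the user. That's a CDN.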
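To make the physics concrete, here is a back-of-the-envelope sketch of the latency floor. The 12,500 km New York to Mumbai distance and the three-round-trip count for a cold HTTPS fetch are assumptions for illustration, not measurements.

```python
# Propagation speed in fiber quoted above: roughly 200,000 km/s.
FIBER_SPEED_KM_PER_S = 200_000

# Assumption: New York to Mumbai is roughly 12,500 km along a great circle.
NY_TO_MUMBAI_KM = 12_500


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on one round trip over fiber, ignoring detours and queuing."""
    one_way_s = distance_km / FIBER_SPEED_KM_PER_S
    return 2 * one_way_s * 1000


if __name__ == "__main__":
    rtt = min_round_trip_ms(NY_TO_MUMBAI_KM)
    print(f"one round trip: {rtt:.0f} ms")
    # Assumption: a cold HTTPS fetch costs about three round trips
    # (TCP handshake, TLS 1.3 handshake, HTTP request/response).
    print(f"three round trips: {3 * rtt:.0f} ms")
```

Real paths are longer than the great-circle distance and add router and queuing delay, so measured latency sits well above this floor.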
A CDN is a three-layer system. Your origin server holds the truth. The CDN backbone is a private high-speed fiber network connecting data centers around the world. Edge nodes — also called PoPs (Points of Presence) — are the CDN's outposts in hundreds of cities. Users connect to the nearest PoP, not to your origin.
Each PoP is a small data center with: reverse proxy servers (Nginx/Varnish/custom) that handle TLS termination and cache lookups; solid-state disk storage for the local cache (typically terabytes); and BGP routers that advertise the CDN's anycast IP addresses. A large PoP might have dozens of servers behind a local load balancer.
Caching is the mechanism that makes CDNs work. The first request for a resource has to go all the way to origin: that's a cache miss. Every subsequent request for the same resource, until the cached copy expires, is served from the PoP's local cache: that's a cache hit. The latency difference is staggering.
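First request: GET /hero.jpg, response header X-Cache: MISS (fetched from origin)
Second request: GET /hero.jpg, response header X-Cache: HIT (served from the PoP's cache)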
Cache miss: ~150ms. Cache hit: ~8ms. That's roughly a 19× speedup, and the math only gets better for users further from the origin. A Tokyo user hitting a Mumbai origin sees ~200ms; the same Tokyo user hitting a Tokyo PoP sees ~3ms. The CDN moves the disk read to 3ms away instead of 200ms away.
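The miss-then-hit behavior can be modeled with a toy in-memory cache for a single PoP. This is a minimal sketch: the `EdgeCache` class, its 60-second default TTL, and the `demo_origin` function are hypothetical names, not any CDN's real implementation.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class EdgeCache:
    """Toy model of one PoP's cache: path -> (body, expiry timestamp)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes], ttl_s: float = 60.0):
        self._fetch = fetch_from_origin
        self._ttl_s = ttl_s
        self._store: Dict[str, Tuple[bytes, float]] = {}

    def get(self, path: str, now: Optional[float] = None) -> Tuple[bytes, str]:
        """Return (body, X-Cache value) for a request to this PoP."""
        now = time.monotonic() if now is None else now
        entry = self._store.get(path)
        if entry is not None and entry[1] > now:
            return entry[0], "HIT"  # served locally, no origin trip
        body = self._fetch(path)  # slow path: full trip to origin
        self._store[path] = (body, now + self._ttl_s)
        return body, "MISS"


def demo_origin(path: str) -> bytes:
    """Stand-in for the origin server (hypothetical)."""
    return f"contents of {path}".encode("utf-8")


if __name__ == "__main__":
    pop = EdgeCache(demo_origin)
    print(pop.get("/hero.jpg")[1])
    print(pop.get("/hero.jpg")[1])
```

The `now` parameter exists only so the expiry logic can be exercised without waiting for real time to pass.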
The fraction of requests served from cache is called the cache hit ratio. A good CDN config achieves 85–99% cache hit ratio. The remaining 1–15% are cache misses — new content, dynamically personalized pages, or cache expirations.
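To see why the hit ratio matters so much, it helps to compute the average latency a user experiences. The sketch below assumes the ~8ms hit and ~150ms miss figures quoted earlier; the function name is made up for illustration.

```python
def average_latency_ms(hit_ratio: float, hit_ms: float, miss_ms: float) -> float:
    """Expected latency when a fraction hit_ratio of requests are cache hits."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


if __name__ == "__main__":
    for ratio in (0.85, 0.95, 0.99):
        avg = average_latency_ms(ratio, hit_ms=8, miss_ms=150)
        print(f"hit ratio {ratio:.0%}: average {avg:.1f} ms")
```

Because misses are so much slower than hits, the miss term dominates the average until the hit ratio gets very close to 100%.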
CDNs respect HTTP cache headers that your origin server sends back. The most important is Cache-Control. This single header tells browsers, proxies, and CDN edges exactly how long to cache a response, under what conditions, and whether it can be shared across users.
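As a rough illustration of how an edge might read this header, the sketch below parses Cache-Control and works out how long a shared cache may keep the response. The function names are hypothetical, and treating a response with no explicit lifetime as uncacheable is a simplifying assumption; real CDNs often apply their own defaults.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split 'public, max-age=60' into {'public': None, 'max-age': '60'}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


def shared_cache_ttl(header: str) -> int:
    """Seconds a shared cache (a CDN edge) may keep the response; 0 = do not store."""
    directives = parse_cache_control(header)
    if "private" in directives or "no-store" in directives:
        return 0
    # s-maxage takes precedence over max-age for shared caches.
    for name in ("s-maxage", "max-age"):
        value = directives.get(name)
        if value is not None and value.isdigit():
            return int(value)
    return 0  # simplifying assumption: no explicit lifetime means no caching


if __name__ == "__main__":
    print(shared_cache_ttl("public, max-age=86400, s-maxage=604800"))
    print(shared_cache_ttl("private, max-age=600"))
```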
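Two directives matter most for a CDN:
private: the response may be cached by the browser only, not by the CDN.
s-maxage: overrides max-age for shared caches (CDN edges). With max-age=86400 and s-maxage=604800, the CDN keeps the response for 7 days even though the browser TTL is 1 day.
When a cached response expires, the CDN doesn't always need to re-download the full file. If the origin sent an ETag header (typically a hash of the file content), the CDN can send a conditional request carrying that value in an If-None-Match header. If the content is unchanged, the origin replies 304 Not Modified with no body, and the CDN keeps serving its cached copy.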
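A minimal sketch of that conditional request, seen from the origin's side. The ETag here is assumed to be a truncated SHA-256 of the body, and `origin_respond` is a hypothetical handler rather than any framework's API.

```python
import hashlib
from typing import Dict, Tuple


def make_etag(body: bytes) -> str:
    """Assumption: the ETag is a quoted, truncated SHA-256 of the body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def origin_respond(
    body: bytes, request_headers: Dict[str, str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a possibly conditional GET."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # Content unchanged: no body is sent, the CDN revalidates its copy.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


if __name__ == "__main__":
    status, headers, _ = origin_respond(b"hero image bytes", {})
    print(status)
    revalidation = {"If-None-Match": headers["ETag"]}
    print(origin_respond(b"hero image bytes", revalidation)[0])
```

The 304 response carries only headers, so revalidating a large file costs one round trip but almost no bandwidth.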
A CDN will typically cache a response that carries the public directive or s-maxage > 0, and will not cache one that carries the private directive or no-store.
The Vary header tells CDNs to maintain separate cache entries for different request variants. Vary: Accept-Encoding is fine: the CDN caches a gzip version and a br version. But Vary: User-Agent is a cache-buster: the CDN creates a separate cache entry for every unique User-Agent string, and there are thousands of them. This effectively destroys your cache hit ratio. Never vary on User-Agent.
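The effect of Vary on the cache is easy to see by building the cache key by hand. In this sketch the `cache_key` function and the sample traffic are hypothetical; real CDNs also normalize header values such as Accept-Encoding before keying on them.

```python
from typing import Dict, Tuple


def cache_key(url: str, vary: str, request_headers: Dict[str, str]) -> Tuple[str, ...]:
    """Build a cache key from the URL plus each request header named in Vary."""
    lowered = {k.lower(): v for k, v in request_headers.items()}
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    return (url,) + tuple(f"{n}={lowered.get(n, '')}" for n in names)


if __name__ == "__main__":
    # Hypothetical traffic: two encodings, 1,000 distinct User-Agent strings.
    requests = [
        {"Accept-Encoding": "br" if i % 2 else "gzip", "User-Agent": f"Browser/{i}"}
        for i in range(1000)
    ]
    for vary in ("Accept-Encoding", "User-Agent"):
        entries = {cache_key("/app.js", vary, r) for r in requests}
        print(f"Vary: {vary} creates {len(entries)} cache entries")
```

Each distinct key is a separate cache entry that has to be filled by its own miss, which is why a high-cardinality Vary header hurts the hit ratio.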
Here's the magic trick: a CDN announces the exact same IP address from every PoP worldwide, simultaneously, using BGP (Border Gateway Protocol). When your device does a DNS lookup for assets.example.com, it gets back, say, 203.0.113.1. BGP then routes your actual connection to the PoP advertising that IP that is nearest in network terms, which is usually, though not always, the geographically nearest one. No client-side logic needed.
Unicast: 203.0.113.1 lives on one machine.
Anycast: 203.0.113.1 is announced from every PoP:
Announced to North American routers via AS12345
Announced to European routers via AS12345
Announced to SEA routers via AS12345
Announced to APAC routers via AS12345
Older CDNs used GeoDNS: returning a different IP per DNS lookup based on the resolver's location. Problem: DNS has TTLs, so you can't respond to network conditions in real time. BGP anycast is network-layer routing: routers re-evaluate paths whenever announcements change, based on reachability and routing policy rather than measured latency. If a PoP goes down and its announcement is withdrawn, BGP automatically reroutes traffic to the next nearest PoP, typically within seconds to a few minutes, without any DNS TTL waiting period.
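The failover behavior can be illustrated with a heavily simplified model of route selection. This sketch assumes a router picks the PoP with the shortest AS path; real BGP applies many more rules (local preference, MED, and so on), and the PoP names and path lengths below are invented.

```python
from typing import AbstractSet, Dict, Optional

# Hypothetical AS-path lengths from one user's ISP to each PoP
# announcing 203.0.113.1.
AS_PATH_LENGTH: Dict[str, int] = {"frankfurt": 2, "london": 3, "new-york": 5}


def select_pop(
    path_lengths: Dict[str, int], withdrawn: AbstractSet[str] = frozenset()
) -> Optional[str]:
    """Pick the PoP with the shortest AS path among those still announcing."""
    candidates = {
        pop: hops for pop, hops in path_lengths.items() if pop not in withdrawn
    }
    if not candidates:
        return None
    # Ties are broken alphabetically here; real routers use other tie-breakers.
    return min(candidates, key=lambda pop: (candidates[pop], pop))


if __name__ == "__main__":
    print(select_pop(AS_PATH_LENGTH))
    # Frankfurt fails and withdraws its announcement: traffic shifts.
    print(select_pop(AS_PATH_LENGTH, withdrawn={"frankfurt"}))
```

The client keeps using the same IP address throughout; only the routers' choice of destination changes.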
"There are only two hard problems in computer science: cache invalidation and naming things."
— Phil Karlton (made famous by Martin Fowler)
You've cached /app.js across 300 PoPs worldwide with a 7-day TTL. Then you push a bug fix. How do you update 300 servers instantly? That's cache invalidation — and there are three strategies in use today.
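Never change the content behind a URL. Instead, embed a version or content hash in the filename (or in a query parameter), so every new build gets a new URL and old cached copies are simply never requested again.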
Best for: Static assets (JS, CSS, images)
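One common way to version a URL is to put a hash of the file's content in its name. The sketch below is an illustration under that assumption; the `versioned_name` function and the 8-character digest length are arbitrary choices, and build tools normally do this for you.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(filename: str, content: bytes, digest_chars: int = 8) -> str:
    """Insert a short content hash between the file's stem and its extension."""
    path = PurePosixPath(filename)
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


if __name__ == "__main__":
    old = versioned_name("static/app.js", b"console.log('v1');")
    new = versioned_name("static/app.js", b"console.log('v2, bug fixed');")
    print(old)
    print(new)
```

Because the name changes whenever the content does, such files can be served with a very long TTL, and a deploy only needs to update the HTML that references them.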
Most CDNs expose an API to invalidate specific URLs across all PoPs, with propagation taking anywhere from milliseconds to several seconds depending on the provider.
Best for: CMS pages, news articles
Tag cached responses with arbitrary keys. Purge all assets with a given tag in one API call.
Best for: E-commerce, data relationships
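Tag-based purging boils down to keeping an index from tag to cached URLs. The following in-memory sketch shows the idea; the `TaggedCache` class and the tag names are hypothetical and do not mirror any provider's purge API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set


class TaggedCache:
    """Toy cache that remembers which URLs carry which tags."""

    def __init__(self) -> None:
        self._bodies: Dict[str, bytes] = {}
        self._urls_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def put(self, url: str, body: bytes, tags: Iterable[str]) -> None:
        self._bodies[url] = body
        for tag in tags:
            self._urls_by_tag[tag].add(url)

    def get(self, url: str) -> Optional[bytes]:
        return self._bodies.get(url)

    def purge_url(self, url: str) -> bool:
        """Remove one URL; returns True if it was cached."""
        return self._bodies.pop(url, None) is not None

    def purge_tag(self, tag: str) -> int:
        """Remove every URL carrying the tag; returns how many were evicted."""
        urls = self._urls_by_tag.pop(tag, set())
        # Index entries under other tags may go stale; purge_url ignores them.
        return sum(self.purge_url(url) for url in urls)


if __name__ == "__main__":
    cache = TaggedCache()
    cache.put("/products/42", b"<html>shoe</html>", ["product-42", "category-shoes"])
    cache.put("/category/shoes", b"<html>list</html>", ["category-shoes"])
    cache.put("/products/7", b"<html>hat</html>", ["product-7"])
    print(cache.purge_tag("category-shoes"))
```

One call evicts both the product page and the category listing, which is what makes tags a good fit for content with data relationships.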
stale-while-revalidate=N is the best of both worlds. When a cached response expires, instead of blocking the user while it fetches fresh content, the CDN serves the stale version immediately and kicks off a background refresh. Users see zero added latency. The trade-off: for up to N seconds after the TTL expires, users might see content that is one version old. For most content (blog posts, product pages), this is an acceptable trade-off for the latency win.
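The decision an edge makes for each request under stale-while-revalidate can be written as a small function of the cached response's age. This is a sketch of the logic only; the function name and return labels are made up.

```python
def cache_decision(age_s: float, max_age_s: float, swr_s: float) -> str:
    """Classify a cached response by its age in seconds."""
    if age_s < max_age_s:
        return "fresh"  # serve from cache, nothing else to do
    if age_s < max_age_s + swr_s:
        # Serve the stale copy now and refresh it in the background.
        return "stale-while-revalidate"
    return "expired"  # too old: the request must wait for the origin


if __name__ == "__main__":
    # Cache-Control: max-age=60, stale-while-revalidate=300
    for age in (30, 90, 400):
        print(age, cache_decision(age, max_age_s=60, swr_s=300))
```

Only requests in the last case pay origin latency, so a busy URL that is requested at least once per revalidation window effectively never blocks a user.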
Modern CDNs can run your code — JavaScript or WebAssembly — at the edge node, before the request ever reaches your origin. This is a paradigm shift: instead of caching pre-built responses, you run logic at the closest server to the user.
Instead of sending users to origin to check which variant they're in, the edge Worker reads the cookie, assigns variant, and serves the correct cached HTML. Zero round-trips to origin. Zero latency added to A/B testing.
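The variant assignment described above might look like the following. Edge platforms usually run JavaScript or WebAssembly, so Python is used here only to show the logic; the cookie name, the visitor ID, and both function names are assumptions.

```python
import hashlib
from typing import Dict, Tuple

VARIANTS = ("A", "B")
COOKIE_NAME = "ab_variant"  # hypothetical cookie name


def assign_variant(cookies: Dict[str, str], visitor_id: str) -> Tuple[str, bool]:
    """Return (variant, needs_set_cookie) for one request."""
    existing = cookies.get(COOKIE_NAME)
    if existing in VARIANTS:
        return existing, False  # returning visitor keeps their variant
    # New visitor: hash the ID so the same visitor always lands in one bucket.
    digest = hashlib.sha256(visitor_id.encode("utf-8")).digest()
    return VARIANTS[digest[0] % len(VARIANTS)], True


def cache_key_for(path: str, variant: str) -> str:
    """Each variant's HTML is cached under its own key at the edge."""
    return f"{path}#variant={variant}"


if __name__ == "__main__":
    variant, set_cookie = assign_variant({}, "visitor-123")
    print(cache_key_for("/landing", variant), set_cookie)
```

Keying the cache on the variant rather than on the cookie header keeps the number of cache entries equal to the number of variants.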
Validate JWT tokens at the PoP before the request reaches origin. Protected routes bounce unauthenticated requests in ~5ms, before wasting origin resources. 98% of auth-blocked requests never touch your servers.
Read the user's country from CDN request metadata and redirect to the correct regional domain — at the PoP. No origin round-trip needed for /eu/ vs /us/ routing.
Resize, compress, convert to WebP/AVIF at the edge based on the Accept header and device width. No pre-generated image variants needed on origin — the edge handles it on first request and caches the result.
The latency saving for valid requests is small (5ms). The win is when requests are rejected: invalid auth is blocked in 5ms at the edge instead of sending a full 250ms round-trip to origin only to be rejected there. For high-traffic APIs under load, this means your origin only sees authenticated traffic.
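The rejection path can be sketched with a token check that uses only the standard library. This assumes HS256-signed JWTs with a secret shared between the origin and the edge, which is one possible setup among several; the function names are hypothetical, and a production edge function would normally be JavaScript or WebAssembly using a vetted JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Issue a token (done by the auth service, shown here for the demo)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256)
    return f"{header}.{payload}.{_b64url_encode(signature.digest())}"


def verify_hs256(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if signature and expiry check out, else None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        if header.get("alg") != "HS256":
            return None
        signed = f"{header_b64}.{payload_b64}".encode("ascii")
        expected = hmac.new(secret, signed, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
    except (ValueError, TypeError, AttributeError):
        return None  # malformed token: reject at the edge
    if not isinstance(claims, dict):
        return None
    now = time.time() if now is None else now
    exp = claims.get("exp")
    if isinstance(exp, (int, float)) and now >= exp:
        return None  # expired
    return claims


if __name__ == "__main__":
    secret = b"demo-secret"  # hypothetical shared secret
    token = sign_hs256({"sub": "user-1", "exp": time.time() + 60}, secret)
    print(verify_hs256(token, secret) is not None)
    print(verify_hs256(token + "x", secret) is not None)
```

Only requests for which `verify_hs256` returns claims would be forwarded to origin; everything else gets a 401 straight from the PoP.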
| Provider | PoPs | Best For | Purge Speed | Edge Compute |
|---|---|---|---|---|
| Cloudflare | 330+ | General purpose, DDoS protection, Workers at edge, free tier | ~150ms global | Workers (V8 isolates, ~0ms cold start) |
| AWS CloudFront | 600+ (incl. regional edge) | AWS-native apps, S3/EC2 origins, deep IAM integration | 1–10 seconds | Lambda@Edge (Node.js/Python, 100ms+ cold start) |
| Fastly | 88+ | Real-time purge, streaming media, Surrogate-Key invalidation | ~150ms instant | Compute@Edge (Wasm, sub-ms cold start) |
| Akamai | 4,200+ | Enterprise, media streaming, highest PoP density worldwide | ~5 seconds | EdgeWorkers (JavaScript) |