The year is 2002. The web is growing fast. Apache powers most of the internet. But a new problem is emerging: what happens when a single server needs to handle 10,000 simultaneous connections? This was called the C10K Problem, and Apache's architecture had a fundamental flaw that made it impossible to solve.
A default Linux thread has an 8 MB stack. 10,000 threads = 80 GB of virtual memory just for stacks, before you've stored a single byte of request data. Even with smaller stacks, you're burning CPU on thousands of context switches per second: the OS scheduler can spend more time switching between threads than those threads spend doing actual work. Nginx sidesteps this entirely by never blocking a thread in the first place.
Nginx starts as a single master process that manages everything. It reads config, binds ports, and spawns worker processes, typically one per CPU core (worker_processes auto; the built-in default is a single worker). Workers do all the actual request handling. The master never touches live traffic.
- Reads nginx.conf
- Binds the listening ports (:80, :443); only the master needs root
- Spawns the workers (worker_processes)
- Handles SIGHUP (reload config without downtime)

Send SIGHUP to the master (or run nginx -s reload). The master reads the new config, forks new worker processes running the new config, then tells old workers to stop accepting new connections and finish their existing ones gracefully. During the brief overlap, both old and new workers run simultaneously, so no connections are dropped.
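As a minimal sketch of the master/worker split described above, the top of an nginx.conf might look like the following. The user name and pid path are assumptions for illustration; they vary by distribution.

```nginx
# Main (top-level) context: read by the master at startup and on every reload.

# Unprivileged account the workers run as (hypothetical name). The master
# keeps root so it can bind ports below 1024.
user www-data;

# "auto" asks Nginx to start one worker per detected CPU core.
worker_processes auto;

# Where the master records its PID; "nginx -s reload" reads this file to
# find the process to signal.
pid /run/nginx.pid;

events {
    # This block is required even when empty; defaults apply.
}
```

After editing the file, nginx -t checks the syntax without touching the running server, and nginx -s reload then asks the master to perform the graceful reload.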
This is the core of Nginx's genius. A single worker process runs an infinite loop, asking the operating system: "which of my thousands of open connections are ready for I/O right now?" The OS answers (via epoll on Linux, kqueue on BSD) with a list of ready file descriptors. The worker processes only those — never waiting on anything.
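The event loop itself is internal to Nginx, but a few directives in the events block control how each worker uses it. This is an illustrative fragment, not a tuning recommendation; the connection limit is an assumed value.

```nginx
events {
    # Upper bound on simultaneous connections one worker may hold open
    # (illustrative value; the built-in default is 512).
    worker_connections 10240;

    # Explicitly select the Linux readiness API. Nginx normally picks the
    # most efficient method for the platform by itself, so this is optional.
    use epoll;

    # Accept all pending connections on each wake-up instead of one at a time.
    multi_accept on;
}
```

Each connection consumes a file descriptor, so the worker's open-file limit (worker_rlimit_nofile or the system ulimit) usually has to be raised to match.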
Every HTTP request that arrives at Nginx passes through a fixed sequence of internal phases. Nginx's module system hooks into specific phases — this is how features like auth, rate limiting, gzip compression, and caching all plug in without touching the core.
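Nginx picks the right server {} block by comparing the Host header against each server_name. Priority: exact match first, then the longest leading wildcard (*.wildcard), then the longest trailing wildcard (wildcard.*), then the first regex that matches. If nothing matches, Nginx uses the default_server for that listening address.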
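The matching order can be illustrated with one server block per name type. This is a sketch that belongs inside the http {} context; the domains and the response bodies are hypothetical and only there to show which block answered.

```nginx
# Fallback for requests whose Host header matches nothing below.
server {
    listen 80 default_server;
    server_name _;
    return 444;   # Nginx-specific code: close the connection without a reply
}

# 1. Exact names are tried first.
server {
    listen 80;
    server_name example.com www.example.com;
    return 200 "exact\n";
}

# 2. Leading wildcard: matches api.example.com, img.example.com, ...
server {
    listen 80;
    server_name *.example.com;
    return 200 "leading wildcard\n";
}

# 3. Trailing wildcard: matches www.example.org, www.example.net, ...
server {
    listen 80;
    server_name www.example.*;
    return 200 "trailing wildcard\n";
}

# 4. Regex (prefix "~"), tried last in order of appearance.
server {
    listen 80;
    server_name ~^(?<tenant>[a-z0-9-]+)\.tenants\.example\.net$;
    return 200 "regex, tenant=$tenant\n";
}
```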
- = exact match, highest priority
- ^~ prefix, stops regex if matched
- ~ case-sensitive regex
- ~* case-insensitive regex

Filters run on the response before it's sent. They're chained: each filter reads from the previous one. Built-in filters: gzip (compress body), headers_filter (add/remove headers), sub (string substitution in body), ssi (server-side includes). Order matters: gzip must run after sub.
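A short sketch showing the four location modifiers and two body filters side by side. The paths, document root, and substitution strings are made up for illustration, and sub_filter assumes Nginx was built with the sub module (--with-http_sub_module), which is not part of the default build.

```nginx
server {
    listen 80;
    server_name example.com;
    root /var/www/example;      # hypothetical document root

    # "=" exact match: wins immediately for this URI only.
    location = /healthz {
        return 200 "ok\n";
    }

    # "^~" prefix: when this is the longest matching prefix, regex
    # locations are not consulted, so /static/logo.png lands here.
    location ^~ /static/ {
        expires 7d;
    }

    # "~" case-sensitive regex.
    location ~ \.php$ {
        return 403;
    }

    # "~*" case-insensitive regex.
    location ~* \.(png|jpe?g|gif)$ {
        expires 30d;
    }

    # Plain prefix: used when nothing above matches.
    location / {
        # Body filters: rewrite text, then compress. The filter order is
        # fixed inside Nginx, not by the order of these directives.
        sub_filter 'http://example.com' 'https://example.com';
        sub_filter_once off;
        gzip on;
        gzip_types text/css application/javascript;

        try_files $uri $uri/ =404;
    }
}
```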
The most common production use of Nginx: sit in front of your app servers and forward requests to them. Clients never talk to your app directly. Nginx handles TLS, absorbs slow clients, adds/rewrites headers, and maintains a keepalive connection pool to upstreams so it's not opening a new TCP connection on every request.
- X-Forwarded-For: <client-ip>, the real client IP, since the upstream otherwise sees only Nginx's IP
- X-Forwarded-Proto: https, tells the app the original scheme was HTTPS
- X-Real-IP: <client-ip>, a simplified single-IP header
- Host: example.com, preserves the original Host header (needs proxy_set_header Host $host)

Without keepalive, Nginx opens a new TCP connection to the app server for every request. With keepalive 32 in the upstream block, each worker keeps up to 32 idle connections to the upstream servers open for reuse. Requests reuse those connections, so hot paths skip the TCP handshake.
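Put together, the forwarded headers and the keepalive pool might be configured roughly as follows. The upstream name and backend addresses are placeholders, not values from any real deployment.

```nginx
upstream app_backend {
    # Hypothetical app servers.
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;

    # Keep up to 32 idle upstream connections per worker for reuse.
    keepalive 32;
}

server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://app_backend;

        # Upstream keepalive needs HTTP/1.1 and an empty Connection header,
        # otherwise "Connection: close" is sent to the backend.
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # Pass the original request details to the application.
        proxy_set_header Host              $host;
        proxy_set_header X-Real-IP         $remote_addr;
        proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```

$proxy_add_x_forwarded_for appends the client address to any X-Forwarded-For value already present, so the application should only trust it when Nginx is the first hop it controls.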
Nginx can distribute requests across multiple upstream servers using three built-in algorithms: round robin, least connections, and IP hash.
Round robin: the default. Config: just list the servers in the upstream block. Add weight=N to skew the distribution. No per-client state is needed, and the per-request cost is negligible.
Least connections: routes to the server with the fewest active connections. Best when requests have wildly varying response times. Config: least_conn; in the upstream block.
IP hash: hashes the client IP (the first three octets for IPv4) so the same client always lands on the same server. Useful for session stickiness without a shared session store. Config: ip_hash; in the upstream block.
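For comparison, here are the three algorithms as upstream blocks, to be placed inside http {}. The pool names and server addresses are hypothetical.

```nginx
# Round robin (default): no directive needed. weight=3 sends this server
# roughly three of every five requests in this pool.
upstream rr_pool {
    server 10.0.0.11:8080 weight=3;
    server 10.0.0.12:8080;
    server 10.0.0.13:8080;
}

# Least connections: pick the server with the fewest active connections.
upstream lc_pool {
    least_conn;
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
    server 10.0.0.13:8080;
}

# IP hash: the same client address maps to the same server while the
# set of servers stays unchanged.
upstream sticky_pool {
    ip_hash;
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
    server 10.0.0.13:8080;
}
```

A location selects a pool by name, for example proxy_pass http://lc_pool;.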
Instead of every app server needing TLS certificates, keys, and the CPU cost of encryption, Nginx does it all in one place. The external world gets HTTPS. The internal network between Nginx and your app servers uses plain HTTP. This is called SSL termination — Nginx terminates the encrypted tunnel.
TLS handshakes are expensive (multiple round trips). Nginx maintains a session cache in shared memory across all worker processes. A returning client presents its session ID, and Nginx looks it up in the cache and resumes the session without the full handshake. Configure with ssl_session_cache shared:SSL:10m, which is 10 MB shared across all workers, holding roughly 40,000 sessions.
Normally a browser checks a certificate's revocation status by querying the CA's OCSP server — adding latency to every new TLS connection. With OCSP stapling (ssl_stapling on), Nginx pre-fetches the OCSP response from the CA and "staples" it to the TLS handshake. Client gets revocation proof for free, with zero extra round trips.
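The termination, session cache, and stapling settings discussed above could be combined in one server block along these lines. Certificate paths, the resolver address, and the backend address are assumptions for the sake of the example.

```nginx
server {
    listen 443 ssl;
    server_name example.com;

    # Certificate chain and private key (hypothetical paths).
    ssl_certificate     /etc/nginx/tls/example.com.fullchain.pem;
    ssl_certificate_key /etc/nginx/tls/example.com.key;

    # Session cache shared by all workers: 10 MB, named "SSL".
    ssl_session_cache   shared:SSL:10m;
    ssl_session_timeout 1h;

    # OCSP stapling. Verification needs the CA chain, and a resolver is
    # needed so Nginx can look up the CA's OCSP responder hostname.
    ssl_stapling            on;
    ssl_stapling_verify     on;
    ssl_trusted_certificate /etc/nginx/tls/ca-chain.pem;
    resolver                192.0.2.53;   # placeholder DNS server

    location / {
        # Plain HTTP on the internal hop: TLS ends at Nginx.
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host              $host;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```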
When Nginx serves a static file, it uses the sendfile() system call instead of the traditional read-then-write approach. The difference: without sendfile, the file data crosses the kernel/user boundary twice. With sendfile, it stays entirely in the kernel — zero copies into user space. This is why Nginx can saturate a gigabit network interface serving static assets while using almost no CPU.
tcp_nopush (Linux: TCP_CORK) tells the kernel to hold back partial packets so that sendfile output goes out in full-sized TCP segments: fewer packets, better throughput for large files. It only takes effect when sendfile is on. tcp_nodelay disables Nagle's algorithm, so the last partial segment is sent immediately without waiting. Nginx enables both: cork during the bulk transfer, uncork to flush the tail. Best of both worlds.
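In configuration terms, the static-file fast path comes down to three directives. The sketch below assumes a hypothetical asset host and document root.

```nginx
http {
    # Hand file transmission to the kernel instead of read()/write()
    # through a user-space buffer.
    sendfile on;

    # With sendfile on, send response headers and the start of the file
    # together and prefer full packets (TCP_CORK on Linux).
    tcp_nopush on;

    # Disable Nagle's algorithm so the final partial packet is not delayed
    # (this is already the default).
    tcp_nodelay on;

    server {
        listen 80;
        server_name static.example.com;

        location /assets/ {
            # A request for /assets/app.css is read from
            # /var/www/example/assets/app.css.
            root /var/www/example;
        }
    }
}
```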
Nginx was built from scratch to fix specific problems with Apache's architecture. They're not just different implementations of the same design — they make fundamentally different trade-offs.
| Attribute | Nginx | Apache |
|---|---|---|
| Concurrency model | Event-driven, non-blocking I/O. Fixed number of workers (one per CPU core). | prefork: one process per connection. worker: one thread per connection. event: hybrid (better but still heavier). |
| Memory usage at 10K connections | ~100–200 MB total (events are cheap) | prefork: ~10–80 GB (one process per conn). worker: ~8 GB. event: ~500 MB. |
| Config style | Declarative block-based. No per-directory runtime config. All config compiled once on load. | Supports .htaccess per-directory overrides — evaluated on every request. Flexible but slower. |
| Dynamic content | Always via external process (proxy_pass, fastcgi_pass). Nginx never runs PHP/Python natively. | Can run PHP/Perl directly in-process via mod_php, mod_perl. More tightly coupled but simpler setup. |
| Static file serving | Extremely fast via sendfile(). Benchmarks: Nginx is 2–4x faster than Apache for static assets. | Good, but adds overhead from .htaccess checks, module chain, and process model. |
| Module loading | Modules compiled in at build time (or dynamic modules in Nginx 1.9.11+). No per-request module scanning. | Modules loaded as DSOs (LoadModule). Easy to add/remove without recompile. |
| Best for | High-concurrency reverse proxy, static file server, SSL termination, microservices front door, CDN edge. | Shared hosting, legacy PHP apps, complex per-directory rewrite rules, .htaccess-based configs. |
| Reverse proxy performance | Purpose-built. Upstream keepalive pool, buffering, and passive upstream health checks (max_fails, fail_timeout) are first-class features. | mod_proxy works but is not the primary use case. More memory overhead per proxied connection. |