01

Why You Can't Just Loop Through All Drivers

Here's the first idea every engineer has: when a rider requests a trip, scan all active drivers, compute the distance to each one, sort by distance, return the top 10. Sounds fine in your head. Let's run the numbers and watch it collapse.

THE NAIVE APPROACH IN PRODUCTION

Uber at peak:

  • ~5 million drivers online globally
  • Location updates every 4 seconds
  • ~100,000 trip requests per minute
  • Target latency: < 100ms per search

Naive linear scan cost:

  • 5M Haversine calculations per request
  • Each Haversine ≈ 3 trig calls (expensive)
  • 100K req/min × 5M ops = 500 billion ops/min
  • You need approximately 10,000 servers just for search

The only sane approach: use a spatial index — a data structure that can answer "find all points within X km of this coordinate" in O(log n) or better, without touching the bulk of the dataset. Uber uses three overlapping systems: Geohash for the simple case, Redis Geo for the hot path, and H3 hexagons for analytics and supply positioning. Let's build up each one.

O(N) IS NOT A PLAN SPATIAL INDEX OR BUST
02

Geohash: Encoding Space as a String

Geohash is a clever trick: convert a (lat, lng) coordinate into a short string, where two strings that share a prefix are guaranteed to be geographically close. "ttnfv" and "ttnfw" are next-door neighbors. "ttnfv" and "s0" are on different continents. This lets you do proximity queries with string comparisons.

The Encoding Algorithm

1

Binary-split the bounding box

Start with the whole world. Longitude: -180° → 180°. Latitude: -90° → 90°. Now repeatedly halve each range. If your value is in the right half, encode a 1. Left half → 0. Alternate between longitude bits (even positions) and latitude bits (odd positions).

2

Interleave lon/lat bits

You end up with a bit string where even positions encode longitude and odd positions encode latitude. For Delhi (77.2090°E, 28.6139°N): the bit string starts 11010011100110011000... The interleaving is key — it creates a Z-order (Morton) space-filling curve that preserves locality.

3

Encode every 5 bits as Base32

Group 5 bits at a time and map to this alphabet: 0123456789bcdefghjkmnpqrstuvwxyz. Each character covers 5 bits → 25 = 32 possible values (no 'a', 'i', 'l', 'o' to avoid confusion). Result: Delhi → ttnfvk at precision 6.

4

Longer prefix = smaller cell = more precise

Each extra character divides the cell into 32 sub-cells. Longer geohash → tighter area → more precision → larger index storage.

Precision Cheat Sheet

PrecisionCell WidthCell HeightUse CaseDelhi Example
15,009 km4,992 kmCountryt
21,252 km624 kmAsia regiontt
3156 km156 kmMetro areattn
439 km20 kmCityttnf
54.9 km4.9 kmNeighborhoodttnfv
61.2 km0.6 kmStreet blockttnfvk
7152 m152 mBuildingttnfvke
838 m19 mExact spotttnfvkes
🗂 LIVE GEOHASH ENCODER — Watch the Bits Get Woven
LONGITUDE BITS (even positions):
LATITUDE BITS (odd positions):
INTERLEAVED (lon at even, lat at odd):
GEOHASH RESULT
——
Each character = 5 bits → Base32 alphabet
03

Interactive: Geohash Grid Visualizer

Click anywhere on the city grid to drop a pickup point. Watch how the geohash cell is identified, the 8 neighboring cells are highlighted, and drivers in the search zone are found — all before a single distance calculation runs.

GEOHASH CELL EXPLORER — precision 6 click grid to place pickup
PICKUP COORDS Click on the map
GEOHASH (prec 6)
CELLS SEARCHED
DRIVERS FOUND
REDIS COMMAND
04

The Boundary Problem: Why One Cell Isn't Enough

Here's the flaw that would sink a naive geohash implementation. A rider and a driver could be 10 meters apart but in completely different geohash cells. Imagine a cell boundary running right down the middle of Outer Ring Road in Delhi. The driver is 9 meters to the left. The rider is 9 meters to the right. Same cell? No. Distance? 18 meters. But if you only search the rider's cell, you completely miss that driver.

THE SOLUTION: ALWAYS SEARCH 9 CELLS

Always search the center cell + all 8 neighbors. A geohash cell at precision 6 is ~1.2km × 0.6km. Its 8 neighbors form a 3×3 grid covering ~3.6km × 1.8km. Any point within that outer ring is at most ~2km away from the center — which is your initial search radius. After the 9-cell sweep, you still run distance calculations, but only on the small set of candidates from those cells (maybe 20–50 drivers), not the millions of drivers total.

❌ Search 1 Cell Only
  • Miss drivers right across the cell boundary
  • A driver 15m away may be invisible
  • Results are incomplete and non-deterministic
  • Worse near city borders (many cells = many boundaries)
  • Unacceptable for production
✅ Search Center + 8 Neighbors
  • Guaranteed coverage of the entire radius zone
  • 9 Redis lookups (still sub-millisecond each)
  • Candidate set stays small (typically 20–80 drivers)
  • Then run exact Haversine distance filter on candidates
  • This is what every production system does
ALSO WATCH OUT FOR: THE WRAPPING EDGE CASE

Geohash at precision 1 shows an ugly discontinuity: cells on the far-left edge of the map (longitude = -180) and cells on the far-right edge (longitude = +180) are adjacent in space but not adjacent in the geohash string. Same for poles. This is a known limitation. In practice, Uber routes around this by using a different spatial index for polar/edge regions (which have essentially zero ridership anyway).

05

Redis Geo Index: The Production Hot Path

Redis has native geo commands built on top of sorted sets. Internally, Redis encodes each (lat, lng) as a 52-bit geohash stored as the score in a ZSET. This gives you ~0.6mm precision — more than enough. The commands feel like magic but are just sorted set operations under the hood.

The Three Commands You Actually Need

REDIS GEO LIFECYCLE — 3 Phases Animated
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ PHASE 3: Driver goes offline / starts trip ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ # Remove from available set when trip starts ZREM drivers:available:DEL "d_42" # Add to in-trip set (for tracking, not proximity search) GEOADD drivers:on_trip:DEL \ 77.2090 28.6139 "d_42"
REDIS GEO INTERNALS: HOW IT REALLY WORKS

Redis GEOADD doesn't store coordinates directly. It calls ZADD internally with a 52-bit integer geohash as the score. GEORADIUS then computes a bounding box from the radius, converts to geohash score ranges, and runs ZRANGEBYSCORE — which is just a binary search on the sorted set. That's why it's O(N+log(M)) where N is results and M is total members. Extremely fast. The distance filtering is done in-process after the set scan returns candidates.

Handling Horizontal Scaling

🗃 CITY-BASED SHARDING — Each Shard Owns a City
SHARD 1 — North India
drivers:available:DEL
drivers:available:GGN
drivers:available:NOI
~190K drivers · 95MB
SHARD 2 — West India
drivers:available:MUM
drivers:available:PNE
drivers:available:HYD
~280K drivers · 140MB
SHARD 3 — South India
drivers:available:BLR
drivers:available:CHN
drivers:available:HYD
~190K drivers · 95MB
Rule: A rider in Mumbai never queries Shard 1. City geofence polygon (not bounding box) determines which shard a driver belongs to. Large cities like Delhi NCR can further sub-shard by quadrant: DEL:NW, DEL:NE, DEL:SE, DEL:SW.
06

Interactive: Live Driver Location Simulator

Watch 30 drivers moving around the city in real time. Click anywhere on the map to drop a pickup point, then hit Search Drivers to see the GEORADIUS query execute — with the Redis commands streaming in the terminal.

UBER SUPPLY SERVICE — Location Dispatcher click map to drop pickup
0
ONLINE
0
IN RADIUS
0
ON TRIP
QUERY ms
# Waiting for pickup location...
# Click the map to begin
WHAT HAPPENS IN REAL UBER
07

H3: Uber's Hexagonal Grid (What They Actually Built)

In 2018 Uber open-sourced H3 — a hierarchical hexagonal grid system that covers the entire Earth. Why hexagons? Because in a hex grid, every neighbor is equidistant from the center. In a square grid, diagonal neighbors are √2 ≈ 1.41× farther than edge neighbors. That asymmetry causes systematic bias in any proximity calculation.

Square Grid Problems
  • Edge neighbors: distance = 1 unit
  • Corner neighbors: distance = 1.41 units
  • → Searching a circle by grid cells is unequal — corner cells cover less of the circle
  • → Drivers in corner cells are systematically deprioritized vs edge-cell drivers at same true distance
Hexagonal Grid Advantages
  • All 6 neighbors: exactly equidistant
  • → No systematic bias in proximity search
  • → Better approximation of a circle (max error 13% vs 41% for squares)
  • → Natural for k-ring expansion (ring of k steps = concentric hexagons)
  • → Cleaner for supply/demand density maps

H3 Resolution Table

ResolutionHex Edge LengthHex AreaTotal Cells (Earth)Uber Use
r=01,107 km4,250,547 km²122Continent
r=422 km1,770 km²288,122Metro region
r=71.2 km5.16 km²98MNeighborhood
r=9174 m0.105 km²2.4BDriver pickup zone ✦
r=1065.9 m0.015 km²16.8BPrecise location
r=129.4 m0.0003 km²822BSub-building
INTERACTIVE H3 K-RING EXPANDER
RING STATS
Ring 0: 1 hex
Ring 1: 6 hexes
Ring 2: 12 hexes
Ring 3: 18 hexes
Formula: k×6 per ring
Total at k: 1+6k(k+1)/2
CURRENT SELECTION
k-ring: 0
Hexes: 1
Drivers found:
Radius: ~175m
H3 vs GEORADIUS
GEORADIUS → hot path
(one call, exact dist)

H3 → analytics path
(density maps, surge
pricing, reposition)
08

The Full Location Pipeline Architecture

Here's how all the pieces connect in production. Three distinct data flows — GPS ingestion, trip request, driver state change — all hitting different services with different consistency requirements.

GPS INGESTION PATH — Every 4 seconds per driver
📱 Driver App
GPS ping
H3 encoded on device
──▶
⚖️ Load Balancer
HTTPS POST
/location
──▶
🔧 Location Service
Go · 200 instances
validates + routes
──▶
⚡ Redis Cluster
GEOADD
drivers:avail:{city}
📨 Kafka
"driver-locations"
ML · H3 · analytics
TRIP REQUEST PATH — <50ms SLA
🙋 Rider App
POST /request-trip
pickup + dest coords
──▶
📡 Demand Svc
GEORADIUS 2km
top 10 candidates
──▶
⚡ Redis Cluster
returns
10 driver_ids + dist
──▶
🚦 Dispatch Svc
parallel lookup:
ETA · rating · type
──▶
✅ Offer Top 3
simultaneously
first accept wins

Key Design Decisions Explained

DecisionWhyAlternative Considered
H3 encoded on device (driver app) Saves 1ms of encoding on servers at 1.25M GPS updates/second Server-side encoding: simpler but doesn't scale
Redis GEORADIUS as hot path (not geohash lookup) GEORADIUS returns exact distances in one call; geohash requires 9 lookups + manual distance filter Custom geohash index: ~2× more code, marginal speed gain
Kafka between Location Service and analytics Decouples high-frequency writes from analytics consumers; replay capability for ML retraining Direct DB write: couples ingest latency to analytics query latency
Two separate Redis sets (available vs on_trip) Trip request only searches available — set is 10× smaller. No wasted compute on busy drivers. Status field + filtered scan: requires reading all drivers then filtering
Offer to top 3 simultaneously If top driver doesn't respond in 10s, next offer is already pending — reduces rider wait time Sequential offer: simpler, but each rejection adds 10s to rider wait
09

Production Numbers & What You'd Actually Build

Uber Scale (2024)

Active drivers~5 million globally
GPS updates/second~1.25 million/s
Trip requests/minute~100,000
GEORADIUS p99 latency< 5ms
End-to-end search SLA< 50ms
Redis sets per city2 (available + on_trip)
Redis memory per city~50MB for 50K drivers
Search radiusDynamic: 0.5–8 km based on supply density

What You'd Build at 1% of Scale

  • Single Redis instance handles up to 1M GEOADD/s — you'd never need a cluster until you're truly big
  • A single GEORADIUS call on a 100K-driver ZSET takes < 1ms
  • One Redis node (r7g.2xlarge on AWS, ~$0.40/hr) handles Uber ops at India scale at a fraction of their scale
  • Start with city partitioning only when you can't fit in one node
  • H3 analytics: only needed once you have rich enough data for supply positioning
THE PRACTICAL APPROACH

Start with GEOADD + GEORADIUS. Add city sharding when your Redis memory exceeds 80%. Add H3 analytics when you have enough trip history to optimize driver positioning. Don't build the full architecture on day 1.

GEOADD ✦ GEORADIUS 50ms OR LESS SCALE IT LATER
END OF HOW-07