Design a Food Delivery System
Design a food delivery system (DoorDash/Uber Eats) with a three-sided marketplace, order tracking, driver dispatch, and ETA calculation.
Problem Statement
Design a food delivery platform like DoorDash that connects customers, restaurants, and delivery drivers. The system must handle restaurant discovery and menus, order placement and preparation tracking, driver dispatch and route optimization, and real-time delivery tracking with accurate ETAs.
Requirements
Functional
- Browse nearby restaurants with menus, ratings, and estimated delivery times
- Place orders: select items, apply promotions, pay, and track preparation status
- Dispatch the optimal driver based on proximity, current load, and restaurant wait time
- Real-time order tracking: preparation progress, driver location, and dynamic ETA
Non-Functional
- Latency: Restaurant search <500ms, order placement <3 seconds, location updates every 4 seconds
- Scale: 30M orders/day at peak, 5M active drivers, 1M restaurants
- ETA accuracy: Within 5 minutes of actual delivery time 90% of the time
- Availability: 99.95% -- missed orders directly hurt revenue
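The ETA accuracy target above can be checked against delivered orders with a small metric. The following is a minimal sketch, assuming predicted and actual delivery durations are available in minutes; the function name `eta_accuracy` and the sample numbers are illustrative only.

```python
def eta_accuracy(predicted_min, actual_min, tolerance_min=5.0):
    """Fraction of deliveries whose actual time fell within
    tolerance_min of the ETA shown to the customer."""
    if len(predicted_min) != len(actual_min):
        raise ValueError("predicted and actual must have the same length")
    if not predicted_min:
        return 0.0
    hits = sum(
        1 for p, a in zip(predicted_min, actual_min)
        if abs(a - p) <= tolerance_min
    )
    return hits / len(predicted_min)


# Illustrative sample: four of five deliveries land within 5 minutes.
predicted = [30, 35, 40, 25, 45]
actual = [33, 34, 47, 29, 44]
print(eta_accuracy(predicted, actual))  # 0.8, below the 90% target
```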
Core Architecture
- Restaurant and Menu Service -- Manages restaurant profiles, menus, operating hours, and real-time availability (restaurants can toggle items off when sold out). Menus are cached in Redis. Restaurant search uses Elasticsearch with geospatial filtering (within delivery radius), and results are ranked by relevance (user preference, rating, delivery time estimate, promotion status).
- Order Orchestration Service -- Manages the order lifecycle: PLACED -> CONFIRMED_BY_RESTAURANT -> PREPARING -> READY_FOR_PICKUP -> PICKED_UP -> DELIVERED. Uses a state machine with transitions triggered by restaurant tablet events, driver app events, and timeout rules (auto-cancel if the restaurant does not confirm within 5 minutes). Each state change is published to Kafka for downstream consumers.
- Driver Dispatch Engine -- When an order is nearly ready (an estimated 5 minutes before pickup), the dispatch engine selects the best driver using a weighted score over normalized features: proximity_to_restaurant * 0.4 + spare_capacity * 0.3 + driver_rating * 0.2 + acceptance_rate * 0.1. Proximity and spare capacity are the inverses of distance and current order count, so a higher score is always better. The offer goes to the top-ranked driver; if that driver declines or does not respond within 30 seconds, it goes to the next. A QuadTree of active driver locations provides fast proximity lookups.
- ETA Prediction Service -- Computes delivery ETA as: max(remaining_prep_time, driver_travel_to_restaurant) + driver_travel_to_customer. The first two terms overlap rather than add, because the driver travels to the restaurant while the food is being prepared. Prep time is predicted by an ML model trained on historical order data (cuisine type, order size, current restaurant load, time of day). Travel time comes from a routing engine with real-time traffic data. The ETA is recalculated every 30 seconds and pushed to the customer via WebSocket.
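The dispatch scoring described above can be sketched as follows. The 0.4/0.3/0.2/0.1 weights come from the text; the normalization caps (`MAX_DISTANCE_KM`, `MAX_ACTIVE_ORDERS`), the inversion of distance and load so that a higher score is better, and all names are assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed normalization caps, not taken from the source.
MAX_DISTANCE_KM = 10.0
MAX_ACTIVE_ORDERS = 3


@dataclass
class DriverCandidate:
    driver_id: str
    distance_km: float      # distance to the restaurant
    active_orders: int      # orders currently being carried
    rating: float           # 1.0 - 5.0
    acceptance_rate: float  # 0.0 - 1.0


def dispatch_score(d: DriverCandidate) -> float:
    """Weighted score in [0, 1]; higher means a better match."""
    proximity = 1.0 - min(d.distance_km, MAX_DISTANCE_KM) / MAX_DISTANCE_KM
    spare_capacity = 1.0 - min(d.active_orders, MAX_ACTIVE_ORDERS) / MAX_ACTIVE_ORDERS
    rating = (d.rating - 1.0) / 4.0
    return (0.4 * proximity + 0.3 * spare_capacity
            + 0.2 * rating + 0.1 * d.acceptance_rate)


def rank_drivers(candidates):
    """Best candidate first; offers would be sent in this order."""
    return sorted(candidates, key=dispatch_score, reverse=True)


near_idle = DriverCandidate("d1", 1.0, 0, 4.5, 0.9)
far_busy = DriverCandidate("d2", 8.0, 2, 5.0, 1.0)
print([d.driver_id for d in rank_drivers([far_busy, near_idle])])
```

A nearby idle driver outranks a distant, busy one even when the distant driver has a better rating, which is the behavior the weights are meant to produce.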
Database Choice
PostgreSQL for orders, users, restaurants, and driver profiles -- ACID transactions for order state machine transitions. Redis for the live driver location index (geospatial sorted sets), restaurant menu cache, and real-time order status. Cassandra for driver location history and delivery tracking events (high write throughput, time-series pattern). Kafka for order events consumed by the ETA, notification, analytics, and driver assignment services.
Key API Endpoints
POST /api/v1/orders
-> Body: { restaurant_id: "R-45", items: [{ menu_item_id: "MI-7", qty: 2 }], address_id: "A-3", payment_method_id: "PM-1" }
-> Returns: { order_id: "ORD-123", status: "PLACED", eta_min: 35 }
WebSocket /ws/orders/{order_id}/track
-> Server pushes: { status: "PICKED_UP", driver_lat: 37.78, driver_lng: -122.41, eta_min: 12 }
POST /api/v1/drivers/{driver_id}/location
-> Body: { lat: 37.775, lng: -122.418, timestamp: ... }
Scaling Insight
Batched multi-order dispatch dramatically improves efficiency. Instead of dispatching each order independently (greedy approach), the system batches orders arriving within a 2-minute window and solves the assignment as an optimization problem: minimize total driver travel distance across all pending orders. This Hungarian algorithm-based approach reduces average delivery time by 15% and increases driver utilization, because a driver finishing a nearby delivery can be routed to pick up the next order on the way back.
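To illustrate why batching beats greedy assignment, here is a compact pure-Python version of the Hungarian algorithm applied to a small order-by-driver cost matrix. This is a sketch under assumptions: the cost values are made up, `assign_orders` and `total_cost` are illustrative names, and a production dispatcher would likely use an optimized solver and a richer cost than distance alone.

```python
def assign_orders(cost):
    """Hungarian algorithm, O(n^2 * m). cost[i][j] is the cost of giving
    order i to driver j; requires len(cost) <= len(cost[0]).
    Returns a list where result[i] is the driver index for order i."""
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    u = [0.0] * (n + 1)   # potentials for orders
    v = [0.0] * (m + 1)   # potentials for drivers
    p = [0] * (m + 1)     # p[j] = order (1-based) matched to driver j
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # flip matches along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    result = [-1] * n
    for j in range(1, m + 1):
        if p[j]:
            result[p[j] - 1] = j - 1
    return result


def total_cost(cost, assignment):
    return sum(cost[i][j] for i, j in enumerate(assignment))


# Greedy would give order 0 its nearest driver (driver 0, cost 1),
# leaving order 1 with driver 1 at cost 100, for a total of 101.
cost = [[1, 2],
        [1, 100]]
batched = assign_orders(cost)
print(batched, total_cost(cost, batched))  # [1, 0] 3
```

The 2-minute window exists so that enough orders and drivers accumulate for the joint assignment to differ from the greedy one.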
Key Tradeoffs
| Decision | Option A | Option B | Chosen |
|---|---|---|---|
| Dispatch | Greedy (assign immediately) | Batched optimization (2-min window) | Batched -- 15% shorter average delivery time, higher driver utilization, slight dispatch delay acceptable |
| ETA model | Simple distance/speed formula | ML model with historical data | ML model -- accounts for prep time variance, traffic, and restaurant-specific patterns |
| Order tracking | Polling (client polls every 10s) | WebSocket push | WebSocket -- real-time updates, lower server load (no repeated polling requests) |
Practical Implementation for .NET Developers
In a .NET application, the services described above map onto the following stack:
ASP.NET Core setup: Create a service class that encapsulates the logic, register it with dependency injection, and inject it into your controllers or minimal API endpoints. The built-in DI container handles lifecycle management.
Entity Framework Core: For database interactions, EF Core provides the ORM layer. Use migrations for schema management and raw SQL for performance-critical queries. Consider Dapper for read-heavy paths where EF Core's overhead matters.
Azure integration: If deploying to Azure, use managed services where they fit the design: Azure Cache for Redis, Azure Database for PostgreSQL (or Azure SQL), Azure Service Bus, and Azure Cosmos DB. These reduce operational overhead and provide built-in monitoring through Application Insights.
Testing: Use xUnit with Testcontainers for integration tests that spin up real databases in Docker. Mock external dependencies with NSubstitute. The WebApplicationFactory class lets you test your entire HTTP pipeline in-process.
Monitoring: Add Application Insights telemetry to track request latency, dependency calls, and custom metrics. Use structured logging with Serilog to make production debugging possible:
Log.Information("Processing order {OrderId} for {CustomerId}", orderId, customerId);
This gives you searchable, structured logs in Azure Monitor or Seq.
Deep-Dive: Clarifying Questions for Food Delivery
- What is the order volume? DoorDash handles ~2 million orders per day. Uber Eats handles similar volumes. Peak hours (11:30 AM - 1:30 PM, 5:30 PM - 9:00 PM) see 3-5x average traffic.
- How does order matching work? Match orders to delivery drivers based on proximity, route optimization (batching multiple orders for one driver), and restaurant preparation time.
- How accurate must ETA prediction be? Users expect ETAs within 5 minutes of actual delivery time. ETA depends on: restaurant prep time (varies wildly), driver travel time (real-time traffic), and queuing at the restaurant.
- Do we need real-time tracking? Users want to see their driver's location on a map in real-time, similar to Uber.
- How do we handle restaurant menu synchronization? Menus change frequently (sold-out items, daily specials). How do we keep them in sync?
- Multi-restaurant orders? Can a user order from two restaurants in one delivery? This significantly complicates routing and coordination.
Specific Functional Requirements
- Restaurant Discovery: Browse and search restaurants by cuisine, rating, delivery time, and distance with real-time availability
- Menu and Ordering: View restaurant menus with real-time item availability, customize items, and place orders
- Order Matching: Automatically assign orders to optimal delivery drivers based on proximity, current load, and route efficiency
- Real-Time Tracking: Show the driver's live location on a map from restaurant pickup to customer delivery
- ETA Prediction: Provide accurate delivery time estimates using ML models trained on historical data, traffic, and restaurant prep times
- Payment: Process payment including food cost, delivery fee, service fee, and tip with tax calculation
- Rating and Reviews: Rate restaurants, food items, and delivery drivers after each order
Specific API Endpoints
GET /api/v1/restaurants?lat=37.77&lng=-122.41&cuisine=pizza&sort=eta
Response: { "restaurants": [{ "id": "r_123", "name": "Joe's Pizza", "eta_minutes": 35, "rating": 4.7, "delivery_fee": 299 }] }
POST /api/v1/orders
Body: { "restaurant_id": "r_123", "items": [{ "item_id": "i_456", "quantity": 2, "customizations": ["extra cheese"] }], "delivery_address": {...}, "tip": 500 }
Response: { "order_id": "o_789", "status": "confirmed", "eta_minutes": 40, "total": 3499 }
WebSocket /ws/orders/:order_id/track
Messages: { "status": "driver_assigned", "driver": { "name": "Bob", "lat": 37.78, "lng": -122.40 }, "eta_minutes": 25 }
Status flow: confirmed -> preparing -> ready_for_pickup -> driver_assigned -> picked_up -> arriving -> delivered
GET /api/v1/orders/:order_id
Response: { "order_id": "o_789", "status": "picked_up", "items": [...], "driver": {...}, "timeline": [...] }
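The status flow listed above can be enforced with a small transition table so that out-of-order events are rejected. This is a minimal sketch: the linear flow is taken from the text, while the rule that an order may be cancelled only before pickup, and the names `ALLOWED`, `advance`, and `InvalidTransition`, are assumptions.

```python
ORDER_FLOW = [
    "confirmed", "preparing", "ready_for_pickup",
    "driver_assigned", "picked_up", "arriving", "delivered",
]

# Each status may advance only to the next one in the flow.
ALLOWED = {cur: {nxt} for cur, nxt in zip(ORDER_FLOW, ORDER_FLOW[1:])}
ALLOWED["delivered"] = set()   # terminal
ALLOWED["cancelled"] = set()   # terminal

# Assumption: cancellation is possible until the driver has the food.
for status in ORDER_FLOW[:ORDER_FLOW.index("picked_up")]:
    ALLOWED[status].add("cancelled")


class InvalidTransition(Exception):
    pass


def advance(current: str, new: str) -> str:
    """Return the new status, or raise if the transition is not allowed."""
    if new not in ALLOWED.get(current, set()):
        raise InvalidTransition(f"{current} -> {new} is not allowed")
    return new


status = "confirmed"
for event in ["preparing", "ready_for_pickup", "driver_assigned", "picked_up"]:
    status = advance(status, event)
print(status)  # picked_up
```

In the database, the same check would typically run inside the transaction that updates the order row, so that two concurrent events cannot both succeed.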
Specific Data Model
Restaurants (PostgreSQL)
| Column | Type | Notes |
|---|---|---|
| restaurant_id | UUID | Primary key |
| name | VARCHAR | |
| location | GEOGRAPHY(POINT, 4326) | PostGIS geography type for geospatial queries |
| cuisine_types | TEXT[] | ["pizza", "italian"] |
| avg_prep_time_minutes | INT | Learned from historical orders |
| rating | DECIMAL(2,1) | 1.0 - 5.0 |
| operating_hours | JSONB | Per-day open/close times |
| is_accepting_orders | BOOLEAN | Real-time toggle |
Orders (PostgreSQL, sharded by order_id)
| Column | Type | Notes |
|---|---|---|
| order_id | UUID | Primary key |
| customer_id | BIGINT | |
| restaurant_id | UUID | |
| driver_id | BIGINT | Nullable until assigned |
| status | ENUM | confirmed, preparing, ready_for_pickup, driver_assigned, picked_up, arriving, delivered, cancelled |
| items | JSONB | Ordered items with customizations |
| subtotal | INT | In cents |
| delivery_fee | INT | In cents |
| tip | INT | In cents |
| estimated_delivery_at | TIMESTAMP | |
| actual_delivery_at | TIMESTAMP | Null until delivered |
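Driver Locations (Redis with geospatial): GEOADD drivers LNG LAT DRIVER_ID. This enables GEOSEARCH radius queries (GEORADIUS on Redis versions before 6.2, where it is not yet deprecated) that return nearby drivers in milliseconds. Availability is not part of the geo index, so either keep only available drivers in the set or filter the results afterward.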
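To make the radius query concrete without a running Redis instance, the sketch below reproduces the same idea in memory: compute the great-circle distance from the restaurant to each driver and keep those inside the radius, nearest first. It is a stand-in for illustration, not how Redis implements its geo index; the function names and coordinates are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two points in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def drivers_within(drivers, lat, lng, radius_km):
    """drivers maps driver_id -> (lat, lng).
    Returns [(driver_id, distance_km)] sorted nearest first."""
    hits = [
        (haversine_km(lat, lng, dlat, dlng), driver_id)
        for driver_id, (dlat, dlng) in drivers.items()
    ]
    return [(driver_id, dist) for dist, driver_id in sorted(hits)
            if dist <= radius_km]


drivers = {
    "d1": (37.780, -122.410),  # a few blocks away
    "d2": (37.800, -122.270),  # across the bay
}
nearby = drivers_within(drivers, 37.775, -122.418, radius_km=3.0)
print([driver_id for driver_id, _ in nearby])  # ['d1']
```

A linear scan like this is fine for a test but not for 100K drivers; the geohash-based index in Redis or a QuadTree exists to avoid touching every driver on each query.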
Specific Back-of-the-Envelope Numbers
Traffic:
- 2 million orders/day, peak of 200K orders/hour during dinner rush
- Average order touches: 1 restaurant notification, 3-5 driver match attempts, and roughly 75-450 location updates during delivery (5-30 minutes at one update every 4 seconds)
- Location updates from active drivers: 100K active drivers * 1 update/4 seconds = 25K location writes/second
Order lifecycle timing:
- Order placement to restaurant confirmation: under 30 seconds
- Restaurant prep time: 15-45 minutes (varies by restaurant and order complexity)
- Driver matching: under 60 seconds
- Delivery: 5-30 minutes depending on distance
Storage:
- Orders: 2M/day * 2KB = 4 GB/day ≈ 1.5 TB/year
- Driver location history: 25K/sec * 50 bytes * 86,400 sec = 108 GB/day (retain 7 days for ETA model training, about 756 GB)
- Restaurant menus: 500K restaurants * 50 items * 500 bytes = 12.5 GB (relatively small)
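The traffic and storage estimates above can be reproduced with a few lines of arithmetic, which makes it easy to re-run them with different inputs. The inputs are the figures from the text; decimal units (1 KB = 1,000 bytes, 1 GB = 10^9 bytes) are an assumption.

```python
ORDERS_PER_DAY = 2_000_000
ORDER_BYTES = 2_000            # 2 KB per order
ACTIVE_DRIVERS = 100_000
UPDATE_INTERVAL_S = 4          # one location update every 4 seconds
LOCATION_BYTES = 50
RESTAURANTS = 500_000
ITEMS_PER_MENU = 50
ITEM_BYTES = 500
SECONDS_PER_DAY = 86_400
RETENTION_DAYS = 7
GB = 1e9

location_writes_per_s = ACTIVE_DRIVERS / UPDATE_INTERVAL_S
orders_gb_per_day = ORDERS_PER_DAY * ORDER_BYTES / GB
orders_tb_per_year = orders_gb_per_day * 365 / 1000
location_gb_per_day = location_writes_per_s * LOCATION_BYTES * SECONDS_PER_DAY / GB
location_gb_retained = location_gb_per_day * RETENTION_DAYS
menus_gb = RESTAURANTS * ITEMS_PER_MENU * ITEM_BYTES / GB

print(f"location writes/sec: {location_writes_per_s:,.0f}")
print(f"orders:              {orders_gb_per_day:.1f} GB/day, "
      f"{orders_tb_per_year:.2f} TB/year")
print(f"location history:    {location_gb_per_day:.0f} GB/day, "
      f"{location_gb_retained:.0f} GB retained")
print(f"menus:               {menus_gb:.1f} GB")
```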
ETA prediction:
- Features: distance, time of day, day of week, restaurant historical prep time, current order queue length, weather, traffic conditions
- Model inference: under 50ms per prediction
- Retrained daily on the previous day's actual delivery times
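Once the model has produced per-leg predictions, they still have to be combined into a single ETA. The sketch below shows one way to do that, assuming the driver heads to the restaurant while the food is being prepared, so those two legs overlap rather than add. The function and parameter names are illustrative.

```python
def estimate_eta_min(remaining_prep_min: float,
                     drive_to_restaurant_min: float,
                     drive_to_customer_min: float) -> float:
    """Combine per-leg predictions into a delivery ETA in minutes.

    Pickup happens when both the food is ready and the driver has
    arrived, so the slower of the two determines the pickup time."""
    pickup_in_min = max(remaining_prep_min, drive_to_restaurant_min)
    return pickup_in_min + drive_to_customer_min


# Food is the bottleneck: the driver arrives early and waits.
print(estimate_eta_min(20, 8, 12))   # 32

# Driver is the bottleneck: food waits 5 minutes on the counter.
print(estimate_eta_min(5, 10, 12))   # 22
```

Summing all three legs instead would give 40 minutes in the first case, overstating the ETA by the 8 minutes the driver spends traveling while the food is still being cooked.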