Design Airbnb
Design Airbnb with search (dates + location), booking system, pricing engine, and reviews.
Problem Statement
Design a vacation rental marketplace like Airbnb where hosts list properties and guests search by location, dates, and filters, book stays, and leave reviews. The system must handle complex availability queries (intersection of date ranges with location), prevent double-bookings, and support dynamic pricing.
Requirements
Functional
- Search listings by location (map viewport), check-in/check-out dates, guest count, and filters (price, amenities, property type)
- Book a listing for specific dates with instant booking or host approval flow
- Dynamic pricing: hosts set base price; system suggests prices based on demand, seasonality, and local events
- Reviews: guests and hosts review each other after checkout (double-blind reveal)
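The double-blind reveal rule can be sketched as a small visibility check. This is a minimal illustration, not Airbnb's actual logic: the `ReviewPair` class and the 14-day `REVEAL_WINDOW` are assumptions introduced here. A review becomes visible only once both sides have submitted, or once the assumed window has closed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Optional

# Assumption: reviews unlock this long after checkout even if one side never writes.
REVEAL_WINDOW = timedelta(days=14)


@dataclass
class ReviewPair:
    """Both sides' reviews for one reservation (hypothetical model)."""
    checkout_at: datetime
    guest_review: Optional[str] = None
    host_review: Optional[str] = None

    def visible_reviews(self, now: datetime) -> Dict[str, str]:
        """Return the reviews that may be shown at time `now`."""
        both_submitted = self.guest_review is not None and self.host_review is not None
        window_closed = now >= self.checkout_at + REVEAL_WINDOW
        if not (both_submitted or window_closed):
            return {}  # still blind: neither side sees the other's review
        submitted = {"guest": self.guest_review, "host": self.host_review}
        return {author: text for author, text in submitted.items() if text is not None}
```

Keeping the rule in one function means neither party can read the other's review and then adjust their own, which is the point of the double-blind design.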
Non-Functional
- Latency: Search results in <1 second with 20+ results per page
- Consistency: Double-booking prevention requires strong consistency on reservations
- Scale: 7M active listings, 150M users, 2M bookings/day, peak 500K concurrent searches
- Availability: 99.99% for search, 99.95% for booking
Core Architecture
- Search Service -- Combines geospatial and temporal filtering. Uses Elasticsearch with geo_bounding_box for location and a custom availability check. Pre-indexed listings include amenities and property attributes. Availability is checked against a calendar service for the top 200 candidates (not all listings). Results are ranked by relevance, price, and listing quality score.
- Availability Calendar Service -- Each listing has a calendar stored in PostgreSQL: one row per date per listing (listing_id, date, status: available/booked/blocked, price). Availability queries use WHERE listing_id = ? AND date BETWEEN ? AND ? AND status = 'available', where the range runs from the check-in date through the last night of the stay (the day before check-out). The count of available dates must equal the stay length. Bookings lock the date range with SELECT ... FOR UPDATE.
- Booking Service -- Orchestrates the reservation: (1) Check availability with row-level locks, (2) Create reservation record with status PENDING, (3) Charge guest via payment gateway, (4) Update calendar dates to BOOKED and mark the reservation CONFIRMED, (5) Notify host. If payment fails, release locks and mark reservation CANCELLED. Uses idempotency keys to prevent duplicate bookings from retries.
- Pricing Engine -- Suggests nightly prices to hosts using an ML model trained on: local demand (search volume for area + dates), seasonality (holidays, events), comparable listings' occupancy rates, and day-of-week patterns. Hosts can accept, adjust, or ignore suggestions. Smart pricing auto-adjusts daily if enabled.
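The availability count check and the idempotent booking flow described above can be sketched as follows. This is a simplified illustration under stated assumptions: it uses an in-memory SQLite database in place of PostgreSQL, so the SELECT ... FOR UPDATE lock is swapped for a conditional UPDATE inside one transaction; the PENDING step is omitted; and `open_db`, `is_available`, `book`, and the `charge` callback are hypothetical names. The date range is written half-open (check-in inclusive, check-out exclusive), which selects the same nights as BETWEEN check-in AND last night.

```python
import sqlite3
from datetime import date, timedelta

SCHEMA = """
CREATE TABLE calendar (
    listing_id TEXT NOT NULL,
    date       TEXT NOT NULL,   -- ISO date, one row per night
    status     TEXT NOT NULL,   -- available / booked / blocked
    price      REAL NOT NULL,
    PRIMARY KEY (listing_id, date)
);
CREATE TABLE reservations (
    idempotency_key TEXT PRIMARY KEY,
    listing_id      TEXT NOT NULL,
    checkin         TEXT NOT NULL,
    checkout        TEXT NOT NULL,
    status          TEXT NOT NULL
);
"""
# Nights of the stay that are still free.
RANGE = "listing_id = ? AND date >= ? AND date < ? AND status = 'available'"


def open_db(listing_id, first_night, nights, price):
    """Create the schema and seed `nights` available rows for one listing."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    rows = [(listing_id, (first_night + timedelta(days=i)).isoformat(), "available", price)
            for i in range(nights)]
    conn.executemany("INSERT INTO calendar VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def is_available(conn, listing_id, checkin, checkout):
    """True when the count of available nights equals the stay length."""
    nights = (checkout - checkin).days
    args = (listing_id, checkin.isoformat(), checkout.isoformat())
    (free,) = conn.execute(f"SELECT COUNT(*) FROM calendar WHERE {RANGE}", args).fetchone()
    return nights > 0 and free == nights


def book(conn, key, listing_id, checkin, checkout, charge):
    """Book the stay once per idempotency key; returns CONFIRMED or CANCELLED."""
    prior = conn.execute(
        "SELECT status FROM reservations WHERE idempotency_key = ?", (key,)).fetchone()
    if prior:
        return prior[0]  # a retry replays the first outcome
    args = (listing_id, checkin.isoformat(), checkout.isoformat())
    nights = (checkout - checkin).days
    try:
        with conn:  # one transaction: commits on success, rolls back on exception
            claimed = conn.execute(
                f"UPDATE calendar SET status = 'booked' WHERE {RANGE}", args).rowcount
            if nights <= 0 or claimed != nights:
                raise RuntimeError("dates unavailable")
            if not charge():
                raise RuntimeError("payment failed")
            conn.execute(
                "INSERT INTO reservations VALUES (?, ?, ?, ?, 'CONFIRMED')", (key, *args))
        return "CONFIRMED"
    except RuntimeError:
        with conn:  # the calendar update was rolled back; record the failure
            conn.execute(
                "INSERT INTO reservations VALUES (?, ?, ?, ?, 'CANCELLED')", (key, *args))
        return "CANCELLED"
```

A retry with the same idempotency key returns the stored status instead of charging again, and a failed payment rolls the calendar rows back to available.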
Database Choice
PostgreSQL for listings, users, reservations, reviews, and the availability calendar -- ACID transactions essential for booking integrity. The calendar table is partitioned by listing_id for locality. Elasticsearch for search -- geospatial queries with faceted filtering (price ranges, amenities, property type). Redis for search result caching (keyed by query hash, 60-second TTL) and session storage. S3 + CDN for listing photos.
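The search cache keyed by query hash with a 60-second TTL might look like the sketch below. To stay self-contained it uses an in-process dictionary rather than Redis; `TTLCache`, `query_hash`, and `cached_search` are hypothetical names, and the `search:` key prefix is an assumption.

```python
import hashlib
import json
import time

CACHE_TTL_SECONDS = 60  # TTL from the design above


def query_hash(params: dict) -> str:
    """Stable cache key: the same filters in any order hash to the same key."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return "search:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class TTLCache:
    """In-process stand-in for Redis SET-with-expiry and GET."""

    def __init__(self, ttl_seconds=CACHE_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: drop the entry and report a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (self._clock() + self._ttl, value)


def cached_search(cache, params, run_search):
    """Return the cached result for `params`, or run the search and cache it."""
    key = query_hash(params)
    hit = cache.get(key)
    if hit is not None:
        return hit
    result = run_search(params)
    cache.set(key, result)
    return result
```

Serializing the parameters with sorted keys keeps equivalent queries from fragmenting the cache; the short TTL bounds how long a stale availability result can be served.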
Key API Endpoints
GET /api/v1/search?lat=37.77&lng=-122.42&checkin=2024-03-01&checkout=2024-03-05&guests=2&min_price=50&max_price=200
-> Returns: { listings: [{ id, title, price_per_night, rating, thumbnail_url, coordinates }], total: 342 }
POST /api/v1/reservations
-> Body: { listing_id: "L-123", checkin: "2024-03-01", checkout: "2024-03-05", guests: 2, payment_method_id: "PM-7" }
-> Returns: { reservation_id: "R-456", status: "CONFIRMED", total: 680.00 }
POST /api/v1/reviews
-> Body: { reservation_id: "R-456", rating: 5, comment: "Amazing stay!" }
Scaling Insight
The two-phase search strategy is essential for performance. Phase 1: Elasticsearch returns the top 200 listings matching location and attribute filters (sub-100ms, no availability check). Phase 2: the Availability Calendar Service checks date availability only for these 200 candidates (parallel batch query to PostgreSQL). This avoids the impossible task of checking availability for all 7M listings on every search and keeps end-to-end search latency under 1 second.
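The two phases can be illustrated with plain in-memory data. In this sketch a list filter stands in for the Elasticsearch query and a set of booked (listing, night) pairs stands in for the calendar table; `Listing`, `phase1_candidates`, `phase2_available`, and `search` are hypothetical names, and ranking by a single quality score is a simplification of the ranking described earlier.

```python
from dataclasses import dataclass
from datetime import date, timedelta

CANDIDATE_LIMIT = 200  # phase 1 cap from the design above


@dataclass(frozen=True)
class Listing:
    id: str
    lat: float
    lng: float
    price: float
    quality: float


def phase1_candidates(listings, viewport, max_price, limit=CANDIDATE_LIMIT):
    """Phase 1: location and attribute filters only, no availability check."""
    south, west, north, east = viewport
    matches = [l for l in listings
               if south <= l.lat <= north and west <= l.lng <= east and l.price <= max_price]
    matches.sort(key=lambda l: l.quality, reverse=True)
    return matches[:limit]


def phase2_available(candidates, booked, checkin, checkout):
    """Phase 2: keep candidates with every night of the stay free."""
    nights = [checkin + timedelta(days=i) for i in range((checkout - checkin).days)]
    return [l for l in candidates if not any((l.id, n) in booked for n in nights)]


def search(listings, booked, viewport, max_price, checkin, checkout):
    candidates = phase1_candidates(listings, viewport, max_price)
    return phase2_available(candidates, booked, checkin, checkout)
```

The expensive per-night check runs against at most `CANDIDATE_LIMIT` listings, regardless of how many listings exist in total.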
Key Tradeoffs
| Decision | Option A | Option B | Chosen |
|---|---|---|---|
| Availability storage | Bitmap per listing (compact) | Row per date per listing (flexible) | Row per date -- enables per-night pricing, blocked dates, and simple SQL queries |
| Search + availability | Single query (Elasticsearch + calendar) | Two-phase (ES then calendar) | Two-phase -- keeps ES index simple, calendar check only for top candidates |
| Booking model | Instant booking only | Instant + host approval | Both -- instant for most listings, approval for premium/long-stay to give hosts control |
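For comparison, the bitmap alternative (Option A in the first row) could be sketched like this. It is an illustration of the option that was not chosen: one integer per listing, with bit i set when day i after an assumed `EPOCH` is available. The `EPOCH` date and the function names are assumptions made for this sketch.

```python
from datetime import date

EPOCH = date(2024, 1, 1)  # assumed origin: bit 0 represents this day


def range_mask(checkin: date, checkout: date) -> int:
    """Bits for every night of the stay (check-out day excluded)."""
    nights = (checkout - checkin).days
    start = (checkin - EPOCH).days
    if nights <= 0 or start < 0:
        raise ValueError("stay must be at least one night on or after EPOCH")
    return ((1 << nights) - 1) << start


def is_available(bitmap: int, checkin: date, checkout: date) -> bool:
    """Available when every bit in the stay's mask is set."""
    mask = range_mask(checkin, checkout)
    return (bitmap & mask) == mask


def book(bitmap: int, checkin: date, checkout: date) -> int:
    """Return a new bitmap with the stay's nights cleared."""
    return bitmap & ~range_mask(checkin, checkout)
```

The range check is a single AND, but the bitmap has no room for a per-night price or a blocked-versus-booked distinction, which is why the table favors one row per date.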
Practical Implementation for .NET Developers
In a .NET application, you would typically implement this pattern using the following approach:
ASP.NET Core setup: Create a service class that encapsulates the logic, register it with dependency injection, and inject it into your controllers or minimal API endpoints. The built-in DI container handles lifecycle management.
Entity Framework Core: For database interactions, EF Core provides the ORM layer. Use migrations for schema management and raw SQL for performance-critical queries. Consider Dapper for read-heavy paths where EF Core's overhead matters.
Azure integration: If deploying to Azure, leverage managed services — Azure Cache for Redis, Azure SQL, Azure Service Bus, Azure Cosmos DB. These eliminate operational overhead and provide built-in monitoring through Application Insights.
Testing: Use xUnit with Testcontainers for integration tests that spin up real databases in Docker. Mock external dependencies with NSubstitute. The WebApplicationFactory class lets you test your entire HTTP pipeline in-process.
Monitoring: Add Application Insights telemetry to track request latency, dependency calls, and custom metrics. Use structured logging with Serilog to make production debugging possible:
Log.Information("Processing reservation {ReservationId} for {GuestId}", reservationId, guestId);
This gives you searchable, structured logs in Azure Monitor or Seq.
System-Specific Clarifying Questions
Before designing Airbnb, ask questions specific to THIS system:
- Who are the primary users? Understanding the user base shapes every technical decision — consumer apps have different requirements than enterprise B2B systems.
- What is the read-to-write ratio? This determines whether you optimize for fast reads (caching, denormalization) or fast writes (write-ahead logs, async processing).
- What is the geographic distribution? Users in one country vs. global users fundamentally changes your data replication and CDN strategy.
- What is the acceptable latency? Some features need sub-100ms responses, others can tolerate seconds. This determines your caching and architecture strategy.
- What is the consistency requirement? Some data (payments, inventory) needs strong consistency. Other data (social feeds, recommendations) can be eventually consistent.
Architecture Deep Dive
The architecture for Airbnb should be designed around the specific access patterns of the system. Do not apply generic templates — every system has unique hotspots, bottlenecks, and scaling challenges.
Write Path: How does data enter the system? Is it bursty (event-driven, flash sales) or steady (sensor data, logs)? Bursty writes need queuing and backpressure. Steady writes can go directly to the database.
Read Path: How is data consumed? Is it fan-out (one write, many reads like social feeds) or point lookups (one read for specific data like user profiles)? Fan-out reads benefit from pre-computation and caching. Point lookups benefit from efficient indexing.
Hot Spots: Where are the bottlenecks? For Airbnb, identify the component that will fail first under load and design mitigation strategies: caching, sharding, rate limiting, or async processing.