Design Tinder
Design Tinder with geospatial matching, swipe queue generation, mutual like detection, and real-time chat. Covers GeoHash indexing and match algorithms.
Problem Statement
Design a location-based dating app like Tinder where users swipe right (like) or left (pass) on profiles. The system must generate a queue of nearby potential matches based on location, age, and gender preferences, detect mutual likes to create matches, and enable real-time chat between matched users.
Requirements
Functional
- Generate a swipe queue of nearby profiles filtered by distance (1-100 mi), age range, and gender preference
- Record swipe decisions (like/pass); detect mutual likes and create a match instantly
- Real-time chat between matched users via WebSocket
- Update user location periodically (every 5 minutes when app is open)
Non-Functional
- Latency: Swipe queue loads in <500ms with 50+ profiles prefetched
- Scale: 75M MAU, 10M DAU, 2B swipes/day, 50M matches/day
- Freshness: Location updates reflected in recommendations within 5 minutes
- Privacy: Exact location never exposed; only distance shown
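As a rough sanity check on the scale figures above, the daily totals can be converted into per-second rates. The sketch below uses only the numbers from the requirements; the 3x peak factor is an assumption for illustration, not a figure from the requirements.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Figures taken from the non-functional requirements.
swipes_per_day = 2_000_000_000
matches_per_day = 50_000_000
daily_active_users = 10_000_000

avg_swipes_per_sec = swipes_per_day / SECONDS_PER_DAY
avg_matches_per_sec = matches_per_day / SECONDS_PER_DAY
swipes_per_user_per_day = swipes_per_day / daily_active_users

# Assumption (not from the requirements): peak load is about 3x the average.
PEAK_FACTOR = 3
peak_swipes_per_sec = avg_swipes_per_sec * PEAK_FACTOR

print(f"average swipes/s:    {avg_swipes_per_sec:,.0f}")
print(f"average matches/s:   {avg_matches_per_sec:,.0f}")
print(f"swipes per DAU/day:  {swipes_per_user_per_day:,.0f}")
print(f"assumed peak swipes/s: {peak_swipes_per_sec:,.0f}")
```

The average works out to roughly 23K swipes per second, so any component on the swipe path needs headroom above that figure.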
Core Architecture
- Recommendation Engine -- Generates the swipe queue per user. Queries a geospatial index (GeoHash in Redis) for users within the specified radius, filters by age/gender preferences, excludes already-swiped profiles (Bloom filter), and ranks by a compatibility score (profile completeness, activity recency, Elo-like desirability score).
- Swipe Processor -- Records each swipe in Kafka, then checks for a mutual like: when user A likes user B, look up whether B already liked A (Redis SET lookup). If mutual, create a match record and push a notification to both users. Sized for 25K swipes/second (2B swipes/day averages about 23K/second).
- GeoHash Location Index -- Users' locations are stored in a Redis geospatial index (GEOADD), which encodes each position as a geohash. When a user opens the app, their location is updated. Nearby-user queries use GEORADIUS (GEOSEARCH in Redis 6.2+, where GEORADIUS is deprecated) with a configurable radius. GeoHash precision level 5 (~5km cells) balances accuracy with query performance.
- Chat Service -- WebSocket-based messaging between matched users. Messages are persisted in Cassandra (partitioned by match_id, sorted by timestamp). Typing indicators and read receipts are handled in-memory at the WebSocket gateway without persistence.
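To make the GeoHash idea concrete, here is a minimal Python sketch of the standard GeoHash encoding: longitude and latitude bits are interleaved by repeated bisection, then emitted as base32 characters. It is illustrative only. Redis computes its own internal geohash on GEOADD, so an application would not normally run this itself, and the function name `geohash_encode` is hypothetical.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # GeoHash alphabet (no a, i, l, o)


def geohash_encode(lat: float, lon: float, precision: int = 5) -> str:
    """Encode a latitude/longitude pair as a GeoHash string of `precision` chars."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0        # accumulates 5 bits per output character
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude

    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits = bits << 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits = bits << 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0

    return "".join(chars)
```

Two users whose hashes share a prefix fall in the same cell at that precision, which is what makes prefix-based bucketing possible. Points on opposite sides of a cell boundary can be close together yet share no prefix, so a radius query generally has to check neighboring cells as well.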
Database Choice
PostgreSQL for user profiles, preferences, and match records -- relational queries for profile lookups and match history. Redis with GEO commands for location indexing -- GEOADD/GEORADIUS provide O(log N + M) proximity queries where M is the result count. Cassandra for swipe history (write-heavy: 2B/day) and chat messages. A Bloom filter (in Redis) per user tracks already-seen profiles to prevent showing the same person twice -- at a 1% false positive rate a filter needs about 9.6 bits per entry, so 1 MB per user holds roughly 870K entries (10M entries would need about 12 MB).
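To illustrate what a proximity query returns, the following sketch filters users by great-circle distance in plain Python. It is an in-process stand-in for the Redis radius query, not a replacement for it; the function names and the sample coordinates are assumptions. Only a rounded distance is returned, never coordinates, in line with the privacy requirement.

```python
import math

EARTH_RADIUS_MI = 3958.8  # mean Earth radius in miles


def haversine_mi(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MI * math.asin(math.sqrt(a))


def nearby(origin, locations, radius_mi):
    """Return (user_id, distance_mi) pairs within radius_mi, nearest first.

    `locations` maps user_id -> (lat, lon). Distances are rounded to one
    decimal place so exact positions are not exposed to the caller.
    """
    lat0, lon0 = origin
    hits = []
    for user_id, (lat, lon) in locations.items():
        distance = haversine_mi(lat0, lon0, lat, lon)
        if distance <= radius_mi:
            hits.append((user_id, round(distance, 1)))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical sample data: one user near the origin, one far away.
sample_locations = {
    "U1": (40.7357, -74.1724),   # Newark, NJ
    "U2": (34.0522, -118.2437),  # Los Angeles, CA
}
print(nearby((40.7128, -74.0060), sample_locations, radius_mi=50))
```

This linear scan is only workable for a handful of points. The spatial index exists precisely to avoid computing the distance to every user.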
Key API Endpoints
GET /api/v1/recommendations?limit=50
-> Returns: { profiles: [{ user_id: "U1", name: "...", photos: [...], distance_mi: 3.2, age: 28 }] }
POST /api/v1/swipe
-> Body: { target_user_id: "U2", direction: "right" }
-> Returns: { match: true, match_id: "M-789" } (or { match: false })
POST /api/v1/messages/{match_id}
-> Body: { text: "Hey!" }
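The swipe endpoint's mutual-like check can be sketched as follows. This is a minimal in-memory illustration, assuming a dictionary of sets as a stand-in for the per-user Redis SETs; the names `likes`, `matches`, and `record_swipe` are hypothetical, and the match ID format simply mirrors the example response above.

```python
from itertools import count

# In-memory stand-ins (assumptions): likes[u] is the set of users u has liked;
# matches maps an unordered user pair to its match ID.
likes: dict[str, set[str]] = {}
matches: dict[frozenset, str] = {}
_match_ids = count(1)


def record_swipe(swiper: str, target: str, direction: str) -> dict:
    """Record one swipe and return a response shaped like POST /api/v1/swipe."""
    if swiper == target:
        raise ValueError("a user cannot swipe on themselves")
    if direction != "right":
        # A pass is recorded elsewhere (swipe history); it can never match.
        return {"match": False}

    likes.setdefault(swiper, set()).add(target)

    # Mutual like: has the target already liked the swiper?
    if swiper in likes.get(target, set()):
        pair = frozenset((swiper, target))
        if pair not in matches:  # keep match creation idempotent
            matches[pair] = f"M-{next(_match_ids)}"
        return {"match": True, "match_id": matches[pair]}

    return {"match": False}
```

In a real deployment the add and the lookup would need to be atomic. If A and B like each other at the same instant, each request must see the other's like or the match is missed. With Redis this is commonly handled with a transaction or a server-side script, though the source does not say which approach is used.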
Scaling Insight
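The Bloom filter for deduplication is the unsung hero. Without it, generating the swipe queue requires checking every candidate against the user's entire swipe history (potentially 100K+ entries). A 1 MB Bloom filter per active user answers "already seen?" with a fixed number of hash probes at a false positive rate of about 1%; a false positive only hides a profile the user has not actually seen, which is an acceptable cost. For 10M DAU, this costs 10 TB of Redis memory (10M x 1 MB) -- well worth the elimination of expensive database lookups on the hot path. Sizing the filter to the expected history instead (100K entries need about 120 KB at 1%) would reduce that to roughly 1.2 TB.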
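The sizing arithmetic and the "already seen?" check can be sketched in a few lines of Python. This is a minimal illustration using the standard Bloom filter formulas and double hashing over SHA-256; the class and function names are hypothetical, and a production system would more likely use a Redis-side Bloom filter than a Python bytearray.

```python
import hashlib
import math


def bloom_capacity(size_bits: int, fp_rate: float) -> int:
    """Entries a filter of size_bits can hold at fp_rate (with optimal hash count)."""
    return int(size_bits * math.log(2) ** 2 / -math.log(fp_rate))


def optimal_hashes(size_bits: int, n_items: int) -> int:
    """Optimal number of hash functions: k = (m / n) * ln 2."""
    return max(1, round(size_bits / n_items * math.log(2)))


class BloomFilter:
    def __init__(self, size_bits: int, num_hashes: int):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from two 64-bit hash values.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd, never zero
        return [(h1 + i * h2) % self.size_bits for i in range(self.num_hashes)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


ONE_MB_BITS = 8 * 1024 * 1024
print("entries per 1 MB at 1% FP:", bloom_capacity(ONE_MB_BITS, 0.01))

# Filter out profiles the user has already swiped on.
seen = BloomFilter(size_bits=9600, num_hashes=optimal_hashes(9600, 1000))
for user_id in ("U1", "U2", "U3"):
    seen.add(user_id)
candidates = ["U2", "U7", "U3", "U9"]
fresh = [c for c in candidates if c not in seen]
print(fresh)
```

A Bloom filter never produces false negatives, so a swiped profile is always filtered out. The only error mode is occasionally skipping a profile the user has not seen.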
Key Tradeoffs
| Decision | Option A | Option B | Chosen |
|---|---|---|---|
| Geo index | PostGIS (SQL-based) | Redis GeoHash (in-memory) | Redis GeoHash -- sub-ms lookups, better for the real-time swipe queue hot path |
| Match detection | Batch check (every N minutes) | Real-time on each swipe | Real-time -- instant match notification is core to the product experience |
| Queue generation | On-demand per request | Pre-computed batch | Pre-computed with on-demand refresh -- amortizes expensive geo queries, queues refilled async |
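The "pre-computed with on-demand refresh" choice can be sketched as a per-user queue that refills itself when it runs low. This is a simplified, synchronous illustration. The class name, the `fetch_candidates` callback, and the threshold values are assumptions, and a real system would trigger the refill asynchronously so the user never waits on it.

```python
from collections import deque
from typing import Callable, Optional


class SwipeQueue:
    """Per-user queue of candidate profile IDs with low-watermark refill."""

    def __init__(self, fetch_candidates: Callable[[int], list],
                 batch_size: int = 50, refill_threshold: int = 10):
        # fetch_candidates(limit) is assumed to run the expensive pipeline
        # (geo query, preference filter, dedup, ranking) and return up to
        # `limit` profile IDs the user has not seen.
        self.fetch_candidates = fetch_candidates
        self.batch_size = batch_size
        self.refill_threshold = refill_threshold
        self._queue: deque = deque()

    def _refill(self) -> None:
        self._queue.extend(self.fetch_candidates(self.batch_size))

    def next_profile(self) -> Optional[str]:
        """Return the next profile ID, refilling first if the queue is low."""
        if len(self._queue) <= self.refill_threshold:
            self._refill()
        return self._queue.popleft() if self._queue else None

    def __len__(self) -> int:
        return len(self._queue)


def make_fake_fetch():
    """Test double that hands out sequential IDs and counts calls."""
    state = {"next_id": 1, "calls": 0}

    def fetch(limit: int) -> list:
        state["calls"] += 1
        start = state["next_id"]
        state["next_id"] += limit
        return [f"U{i}" for i in range(start, start + limit)]

    return fetch, state
```

With a batch size of 50 and a threshold of 10, the expensive candidate pipeline runs about once per 40 swipes rather than once per swipe, which is the amortization the table refers to.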
Practical Implementation for .NET Developers
In a .NET application, you would typically implement this pattern using the following approach:
ASP.NET Core setup: Create a service class that encapsulates the logic, register it with dependency injection, and inject it into your controllers or minimal API endpoints. The built-in DI container handles lifecycle management.
Entity Framework Core: For database interactions, EF Core provides the ORM layer. Use migrations for schema management and raw SQL for performance-critical queries. Consider Dapper for read-heavy paths where EF Core's overhead matters.
Azure integration: If deploying to Azure, leverage managed services — Azure Cache for Redis, Azure SQL, Azure Service Bus, Azure Cosmos DB. These eliminate operational overhead and provide built-in monitoring through Application Insights.
Testing: Use xUnit with Testcontainers for integration tests that spin up real databases in Docker. Mock external dependencies with NSubstitute. The WebApplicationFactory class lets you test your entire HTTP pipeline in-process.
Monitoring: Add Application Insights telemetry to track request latency, dependency calls, and custom metrics. Use structured logging with Serilog to make production debugging possible:
Log.Information("Processing swipe from {SwiperId} on {TargetId}", swiperId, targetId);
This gives you searchable, structured logs in Azure Monitor or Seq.
System-Specific Clarifying Questions
Before designing Tinder, ask questions specific to this system:
- Who are the primary users? This is a consumer mobile app (75M MAU, 10M DAU), and consumer apps have different requirements than enterprise B2B systems.
- What is the read-to-write ratio? Swipes are write-heavy (2B/day), while swipe queue generation reads from the geo index and profile store. The answer determines where to optimize for fast reads (caching, denormalization) and where for fast writes (write-ahead logs, async processing).
- What is the geographic distribution? Matching is local (1-100 mi radius), but users in one country vs. global users fundamentally changes your data replication and CDN strategy.
- What is the acceptable latency? The swipe queue must load in under 500ms and chat is real-time, while location updates can take up to 5 minutes to appear in recommendations. This determines your caching and architecture strategy.
- What is the consistency requirement? Mutual-like detection must not miss a match, so it needs stronger guarantees. Recommendations and location data can be eventually consistent.
Architecture Deep Dive
The architecture for Tinder should be designed around the specific access patterns of the system. Do not apply generic templates — every system has unique hotspots, bottlenecks, and scaling challenges.
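Write Path: How does data enter the system? Is it bursty (event-driven) or steady? Bursty writes need queuing and backpressure; steady writes can go directly to the database. For Tinder, the dominant write is the swipe stream (2B/day), which the design above records in Kafka and persists to Cassandra.
Read Path: How is data consumed? Is it fan-out (one write, many reads, as in social feeds) or point lookups (one read for specific data, such as a user profile)? Fan-out reads benefit from pre-computation and caching; point lookups benefit from efficient indexing. For Tinder, the swipe queue is a per-user read that combines a proximity query with profile lookups, which is why it is pre-computed and refilled asynchronously.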
Hot Spots: Where are the bottlenecks? For Tinder, identify the component that will fail first under load and design mitigation strategies: caching, sharding, rate limiting, or async processing.