Medium · 7 min read · Updated 2026-06-08

Design a Message Queue

Design a distributed message queue with topics, partitions, consumer groups, ordering guarantees, and at-least-once delivery.

Problem Statement

Design a distributed message queue (like Kafka) that enables decoupled, asynchronous communication between producer and consumer services. The system must support topics with partitions for parallel consumption, consumer groups for load distribution, configurable delivery guarantees (at-least-once, at-most-once, exactly-once), and message ordering within partitions.

Requirements

[Figure: System architecture for Design a Message Queue]

Functional

Producers publish messages to named topics; messages are routed to partitions by key (or round-robin if no key)
Consumer groups: each partition is consumed by exactly one consumer in a group; rebalanced on consumer join/leave
Message ordering guaranteed within a partition (not across partitions)
Configurable retention: time-based (e.g., 7 days) or size-based (e.g., 1 TB per partition)
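
The key-based routing rule in the first requirement can be sketched as follows. This is a minimal illustration, not a reference implementation: the `Partitioner` class is a hypothetical helper, and CRC32 is an assumed stand-in for whatever hash function a real broker client uses.

```python
import itertools
import zlib
from typing import Optional


class Partitioner:
    """Chooses a partition for each message (hypothetical helper)."""

    def __init__(self, num_partitions: int) -> None:
        if num_partitions <= 0:
            raise ValueError("num_partitions must be positive")
        self.num_partitions = num_partitions
        # Endless 0, 1, ..., n-1, 0, 1, ... cycle used for keyless messages.
        self._round_robin = itertools.cycle(range(num_partitions))

    def partition_for(self, key: Optional[str]) -> int:
        if key is None:
            return next(self._round_robin)
        # CRC32 is stable across processes, unlike Python's built-in hash().
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions


partitioner = Partitioner(num_partitions=8)
# The same key always maps to the same partition, which preserves per-key order.
assert partitioner.partition_for("user-123") == partitioner.partition_for("user-123")
# Keyless messages rotate through the partitions in turn.
print([partitioner.partition_for(None) for _ in range(3)])  # [0, 1, 2]
```

Because the mapping depends on the partition count, adding partitions to an existing topic changes where a given key lands, so per-key ordering only holds while the count stays fixed.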

Non-Functional

Throughput: 1M messages/second per topic, 10M aggregate across all topics
Latency: <10ms end-to-end (producer publish to consumer receive) at p99
Durability: No message loss -- messages replicated to 3 brokers before acknowledgment
Scale: 10,000+ topics, 100,000+ partitions across a cluster

Core Architecture

[Figure: How Design a Message Queue works step by step]

Broker Cluster -- Each broker manages a set of partitions. A partition is an ordered, append-only log stored on disk. Writes are sequential (disk-friendly), achieving 500 MB/s per broker. Each partition has a leader broker (handles reads/writes) and 2 follower replicas (sync writes for durability).
Partition Manager -- Assigns partitions to brokers using a balanced placement algorithm. When a broker fails, reassigns its leader partitions to in-sync replicas (ISR) within seconds. Uses ZooKeeper (or KRaft in newer designs) for leader election and metadata management.
Consumer Group Coordinator -- Tracks consumer membership per group. When consumers join or leave, triggers a rebalance: reassigns partitions to consumers using a sticky assignment strategy (minimizes partition movement). Each consumer commits its offset (position in the log) periodically to track progress.

[Figure: Data flow through Design a Message Queue]

Producer Client -- Batches messages (configurable by size or time), compresses batches (lz4/snappy), and sends each batch to the leader broker for its partition. Supports configurable acks: acks=0 (fire-and-forget), acks=1 (leader only), acks=all (all ISR replicas). Retries on failure; a producer ID plus per-partition sequence numbers lets the broker discard duplicate retries (idempotent producer).
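
The sticky rebalance performed by the Consumer Group Coordinator could look roughly like the sketch below. The function name and data shapes are assumptions for illustration. It is deliberately simplified: it caps each member at the ceiling of the fair share and does not guarantee the tightest possible balance in every case.

```python
from typing import Dict, List


def sticky_assign(
    partitions: List[int],
    members: List[str],
    previous: Dict[str, List[int]],
) -> Dict[str, List[int]]:
    """Rebalance partitions across members, moving as few as possible."""
    if not members:
        return {}
    # Fair-share ceiling: no member may hold more than this many partitions.
    max_per_member = -(-len(partitions) // len(members))  # ceiling division
    valid = set(partitions)
    assignment: Dict[str, List[int]] = {m: [] for m in members}
    claimed = set()

    # Pass 1: surviving members keep what they already had, up to the ceiling.
    for member in members:
        for p in previous.get(member, []):
            if p in valid and p not in claimed and len(assignment[member]) < max_per_member:
                assignment[member].append(p)
                claimed.add(p)

    # Pass 2: orphaned partitions go to whichever member currently holds the fewest.
    for p in partitions:
        if p not in claimed:
            target = min(members, key=lambda m: len(assignment[m]))
            assignment[target].append(p)
    return assignment


# Consumer "c" leaves the group: only its two partitions move.
before = {"a": [0, 1], "b": [2, 3], "c": [4, 5]}
print(sticky_assign(list(range(6)), ["a", "b"], before))
# {'a': [0, 1, 4], 'b': [2, 3, 5]}
```

Keeping existing assignments in place matters because every moved partition forces a consumer to drop local state and resume from the last committed offset.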

Database Choice

No external database for message storage -- messages are stored on the brokers' local filesystems as append-only segment files. Each segment is a sequential log file (e.g., 1 GB) with an index file mapping offsets to file positions. This exploits sequential disk I/O and the OS page cache for near-memory-speed reads. ZooKeeper/KRaft holds cluster metadata (broker registry, topic configs, partition assignments). Consumer offsets are not kept there; they are stored in a special internal topic (__consumer_offsets).
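
To make the segment-plus-index idea concrete, here is a small sketch of one segment. The `Segment` class, the length-prefixed record layout, and the in-memory index are all assumptions made for the example; a real broker would persist the index and rebuild it on restart.

```python
import os
import struct
import tempfile
from typing import List


class Segment:
    """One append-only segment file plus an offset -> byte-position index."""

    def __init__(self, path: str, base_offset: int = 0) -> None:
        self.base_offset = base_offset
        # Index entry i holds the file position of offset base_offset + i.
        self._positions: List[int] = []
        self._file = open(path, "a+b")

    def append(self, payload: bytes) -> int:
        """Append a record at the end of the file and return its offset."""
        self._file.seek(0, os.SEEK_END)
        self._positions.append(self._file.tell())
        # Assumed record layout: 4-byte big-endian length, then the payload.
        self._file.write(struct.pack(">I", len(payload)) + payload)
        self._file.flush()
        return self.base_offset + len(self._positions) - 1

    def read(self, offset: int) -> bytes:
        """Look up the byte position in the index, then read one record."""
        index = offset - self.base_offset
        if not 0 <= index < len(self._positions):
            raise IndexError(f"offset {offset} not in this segment")
        self._file.seek(self._positions[index])
        (length,) = struct.unpack(">I", self._file.read(4))
        return self._file.read(length)

    def close(self) -> None:
        self._file.close()


with tempfile.TemporaryDirectory() as tmp:
    segment = Segment(os.path.join(tmp, "segment.log"), base_offset=10480)
    segment.append(b"first")
    offset = segment.append(b"second")
    print(offset, segment.read(offset))  # 10481 b'second'
    segment.close()
```

Writes only ever touch the end of the file, and a read is one index lookup followed by one seek, which is what lets the page cache serve recent messages without random disk access.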

[Figure: Interview tips for Design a Message Queue]

Key API Endpoints

POST /api/v1/topics/{topic}/messages
  -> Body: { key: "user-123", value: "{order_id: 456}", headers: {} }
  -> Returns: { partition: 7, offset: 10482 }

GET /api/v1/topics/{topic}/partitions/{partition}/messages?offset=10480&limit=100
  -> Returns: { messages: [{ offset: 10480, key: "...", value: "...", timestamp: ... }] }

POST /api/v1/consumer-groups/{group}/offsets
  -> Body: { offsets: [{ topic: "orders", partition: 7, offset: 10483 }] }
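
The three endpoints combine into an at-least-once consume loop: fetch from the committed offset, process, and only then commit. The sketch below models them with an in-memory stand-in rather than real HTTP calls. `InMemoryPartition` and `consume_once` are hypothetical names, and it assumes the committed offset means "next offset to read", which matches the example above where the message at offset 10482 is followed by a commit of 10483.

```python
from typing import Callable, Dict, List, Tuple


class InMemoryPartition:
    """Stand-in for one partition behind the endpoints above (hypothetical)."""

    def __init__(self) -> None:
        self.log: List[str] = []
        self.committed: Dict[str, int] = {}  # group -> next offset to read

    def publish(self, value: str) -> int:
        self.log.append(value)
        return len(self.log) - 1

    def fetch(self, offset: int, limit: int) -> List[Tuple[int, str]]:
        return list(enumerate(self.log[offset:offset + limit], start=offset))

    def commit(self, group: str, offset: int) -> None:
        self.committed[group] = offset


def consume_once(
    partition: InMemoryPartition,
    group: str,
    handler: Callable[[str], None],
    limit: int = 100,
) -> int:
    """Fetch from the committed offset, process, then commit. Returns batch size."""
    start = partition.committed.get(group, 0)
    batch = partition.fetch(start, limit)
    for _, value in batch:
        handler(value)  # may raise; nothing is committed in that case
    if batch:
        # Commit the offset of the NEXT message to read, after processing.
        partition.commit(group, batch[-1][0] + 1)
    return len(batch)


partition = InMemoryPartition()
for value in ["a", "b", "c"]:
    partition.publish(value)

seen: List[str] = []
crashed = {"done": False}


def handler(value: str) -> None:
    if value == "b" and not crashed["done"]:
        crashed["done"] = True
        raise RuntimeError("simulated crash before commit")
    seen.append(value)


try:
    consume_once(partition, "billing", handler)
except RuntimeError:
    pass  # the offset was never committed, so the batch is fetched again
consume_once(partition, "billing", handler)
print(seen)  # ['a', 'a', 'b', 'c']
```

Message "a" is handled twice because the crash happened before the commit. That duplicate is the cost of at-least-once delivery; committing before processing would instead give at-most-once and risk losing "b".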

Scaling Insight

The append-only log with sequential I/O is what makes this design fast. On spinning disks, random writes manage only about 100 IOPS, while sequential writes reach 500+ MB/s, which is equivalent to millions of small messages per second. By writing all messages sequentially to segment files and serving reads from the OS page cache, a single broker can sustain somewhere between 500 MB/s and 1 GB/s on commodity hardware, depending on its disks and network. Throughput then scales roughly linearly with the number of brokers, provided partitions are spread evenly and no single partition becomes a hotspot.
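
A back-of-the-envelope broker count follows from these figures. The 10M messages/second aggregate and the replication factor of 3 come from the requirements, and 500 MB/s is the per-broker sequential write figure; the 1 KB average message size is an assumption, since the requirements do not state one.

```python
import math

AGGREGATE_MSGS_PER_SEC = 10_000_000  # from the non-functional requirements
REPLICATION_FACTOR = 3               # leader plus 2 followers
BROKER_WRITE_MB_PER_SEC = 500        # per-broker sequential write figure
AVG_MESSAGE_BYTES = 1_000            # ASSUMPTION: not stated in the requirements


def brokers_needed(
    msgs_per_sec: int,
    msg_bytes: int,
    replication: int,
    broker_mb_per_sec: int,
) -> int:
    """Minimum brokers needed to absorb replicated write traffic."""
    ingest_mb_per_sec = msgs_per_sec * msg_bytes / 1_000_000
    # Every message is written once per replica somewhere in the cluster.
    disk_mb_per_sec = ingest_mb_per_sec * replication
    return math.ceil(disk_mb_per_sec / broker_mb_per_sec)


print(brokers_needed(
    AGGREGATE_MSGS_PER_SEC,
    AVG_MESSAGE_BYTES,
    REPLICATION_FACTOR,
    BROKER_WRITE_MB_PER_SEC,
))  # 60
```

This is a floor, not a sizing recommendation: it ignores compression, consumer read traffic, network limits, and the headroom needed to survive a broker failure.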

[Figure: When to use Design a Message Queue]

Key Tradeoffs

Decision	Option A	Option B	Chosen
Delivery guarantee	At-most-once (fast)	At-least-once (safe)	At-least-once default -- consumers must be idempotent; lost messages are worse than duplicates
Storage	In-memory only (fast, volatile)	Disk-backed log (durable)	Disk-backed -- sequential I/O is nearly as fast, provides retention and replay capability
Metadata coordination	ZooKeeper (external)	KRaft (built-in Raft)	KRaft -- eliminates operational dependency, simpler deployment
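
The at-least-once choice pushes deduplication onto consumers. One common way to make a consumer idempotent is to remember which messages it has already applied, as in this sketch. `IdempotentHandler` is a hypothetical name, and the in-memory set is an assumption for brevity; in practice the record of processed messages would need to be stored durably, ideally in the same transaction as the side effect.

```python
from typing import Callable, Set, Tuple


class IdempotentHandler:
    """Applies each (topic, partition, offset) at most once."""

    def __init__(self, apply: Callable[[str], None]) -> None:
        self._apply = apply
        # ASSUMPTION: in-memory only; a real consumer would persist this.
        self._processed: Set[Tuple[str, int, int]] = set()

    def handle(self, topic: str, partition: int, offset: int, value: str) -> bool:
        """Return True if the message was applied, False if it was a duplicate."""
        message_id = (topic, partition, offset)
        if message_id in self._processed:
            return False
        self._apply(value)
        self._processed.add(message_id)
        return True


applied = []
handler = IdempotentHandler(applied.append)
handler.handle("orders", 7, 10482, "order 456")
handler.handle("orders", 7, 10482, "order 456")  # redelivery after a retry
print(applied)  # ['order 456']
```

The (topic, partition, offset) triple works as an identifier because offsets are unique within a partition; a business key such as an order ID is an alternative when the same event can arrive through more than one path.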

Practical Implementation for .NET Developers

In a .NET application, you would typically implement this pattern using the following approach:

[Figure: Advantages and disadvantages of Design a Message Queue]

ASP.NET Core setup: Create a service class that encapsulates the logic, register it with dependency injection, and inject it into your controllers or minimal API endpoints. The built-in DI container handles lifecycle management.

Entity Framework Core: The queue itself keeps messages in broker log files, so EF Core applies only to the application data around it. Use migrations for schema management and raw SQL for performance-critical queries. Consider Dapper for read-heavy paths where EF Core's overhead matters.

Azure integration: If deploying to Azure, leverage managed services — Azure Cache for Redis, Azure SQL, Azure Service Bus, Azure Cosmos DB. These eliminate operational overhead and provide built-in monitoring through Application Insights.

[Figure: Real-world examples of Design a Message Queue]

Testing: Use xUnit with Testcontainers for integration tests that spin up real databases in Docker. Mock external dependencies with NSubstitute. The WebApplicationFactory class lets you test your entire HTTP pipeline in-process.

Monitoring: Add Application Insights telemetry to track request latency, dependency calls, and custom metrics. Use structured logging with Serilog to make production debugging possible:

Log.Information("Processing order {OrderId} for {CustomerId}", orderId, customerId);

This gives you searchable, structured logs in Azure Monitor or Seq.

[Figure: Comparing key aspects of Design a Message Queue]

System-Specific Clarifying Questions

Before designing the message queue, ask questions specific to this system:

[Figure: Key components of Design a Message Queue]

Who are the primary users? Understanding the user base shapes every technical decision — consumer apps have different requirements than enterprise B2B systems.
What is the read-to-write ratio? This determines whether you optimize for fast reads (caching, denormalization) or fast writes (write-ahead logs, async processing).
What is the geographic distribution? Users in one country vs. global users fundamentally changes your data replication and CDN strategy.
What is the acceptable latency? Some features need sub-100ms responses, others can tolerate seconds. This determines your caching and architecture strategy.
What is the consistency requirement? Some data (payments, inventory) needs strong consistency. Other data (social feeds, recommendations) can be eventually consistent.

Architecture Deep Dive

The architecture for the message queue should be built around the system's specific access patterns. Do not apply generic templates: every system has its own hotspots, bottlenecks, and scaling challenges.

Write Path: How does data enter the system? Is it bursty (event-driven, flash sales) or steady (sensor data, logs)? Bursty writes need queuing and backpressure. Steady writes can go directly to the database.
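
One simple form of backpressure for bursty writes is a bounded buffer that refuses new messages when full, so the producer has to slow down or retry. The sketch below uses Python's standard `queue` module; the `BoundedIngest` class and its method names are assumptions for illustration.

```python
import queue
from typing import List


class BoundedIngest:
    """Fixed-capacity buffer that rejects writes instead of growing without limit."""

    def __init__(self, capacity: int) -> None:
        self._buffer: "queue.Queue[str]" = queue.Queue(maxsize=capacity)

    def offer(self, message: str) -> bool:
        """Accept the message if there is room; otherwise signal backpressure."""
        try:
            self._buffer.put_nowait(message)
            return True
        except queue.Full:
            return False

    def drain(self, max_items: int) -> List[str]:
        """Remove up to max_items messages in arrival order."""
        items: List[str] = []
        while len(items) < max_items:
            try:
                items.append(self._buffer.get_nowait())
            except queue.Empty:
                break
        return items


ingest = BoundedIngest(capacity=2)
print(ingest.offer("m1"), ingest.offer("m2"), ingest.offer("m3"))  # True True False
print(ingest.drain(max_items=10))  # ['m1', 'm2']
```

Returning False rather than blocking leaves the policy to the caller, which can retry with a delay, shed load, or surface an error to the client.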

Read Path: How is data consumed? Is it fan-out (one write, many reads like social feeds) or point lookups (one read for specific data like user profiles)? Fan-out reads benefit from pre-computation and caching. Point lookups benefit from efficient indexing.

Hot Spots: Where are the bottlenecks? For the message queue, identify the component that will fail first under load and design mitigation strategies: caching, sharding, rate limiting, or async processing.
