Publish-Subscribe Pattern
The Problem the Publish-Subscribe Pattern Solves
Pub/Sub is the foundation of event-driven architectures. It enables microservices to communicate asynchronously, decouples producers from consumers, and handles traffic spikes by buffering messages.
How It Works Under the Hood
Pub/Sub (Publish-Subscribe) is a messaging pattern where publishers send messages to topics without knowing who will receive them, and subscribers receive messages from topics they are interested in. The publisher and subscriber are completely decoupled — they do not need to know about each other.
A publisher sends a message to a topic (e.g., 'order-placed'). The message broker (Kafka, RabbitMQ, Google Cloud Pub/Sub) stores the message and delivers it to all subscribers of that topic. Subscribers process messages asynchronously at their own pace.
This decouples the ordering service (publisher) from the payment service, inventory service, and notification service (subscribers). If the notification service is down, messages queue up and are processed when it recovers.
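The flow above can be sketched with a toy in-process broker. This is only an illustration of topics and fan-out: the class name InMemoryBroker and its methods are made up for this example, delivery is synchronous, and nothing is persisted, whereas a real broker such as Kafka or RabbitMQ stores messages and delivers them asynchronously.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class InMemoryBroker:
    """Toy broker (hypothetical): one list of handlers per topic."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # The subscriber registers interest in a topic, not in a publisher.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver the message to every subscriber of the topic.

        Returns the number of deliveries so the fan-out is visible.
        """
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = InMemoryBroker()

# Three independent subscribers; the publisher never references them.
received = {"payment": [], "inventory": [], "notification": []}
for service in received:
    broker.subscribe("order-placed", received[service].append)

delivered = broker.publish("order-placed", {"order_id": 42})
```

Adding a fourth subscriber is one more subscribe call; the publish call does not change, which is the decoupling the text describes.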
The Mental Model
- Topics: Named channels for messages. Publishers write to topics; subscribers read from topics.
- Decoupling: Publishers do not know about subscribers, and vice versa. You can add new subscribers without changing publishers.
- Fan-out: One message can be delivered to many subscribers simultaneously.
- At-least-once delivery: Most Pub/Sub systems guarantee each message is delivered at least once. Consumers must be idempotent.
- Ordering: Some systems guarantee ordering within a partition (Kafka). Others do not by default (standard SNS topics).
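Because at-least-once delivery can hand the same message to a consumer twice, the consumer has to make a repeat harmless. A minimal sketch of one common approach, deduplicating by message ID, is shown below. The class and field names are hypothetical, and the in-memory set is an assumption for illustration; a production consumer would keep seen IDs in a durable store.

```python
from typing import Set


class IdempotentConsumer:
    """Hypothetical consumer that applies each message ID at most once."""

    def __init__(self) -> None:
        self._seen_ids: Set[str] = set()
        self.charged_total = 0

    def handle(self, message_id: str, amount: int) -> bool:
        """Apply the message once; return False if it was a duplicate."""
        if message_id in self._seen_ids:
            return False  # redelivery: skip the side effect
        self.charged_total += amount
        self._seen_ids.add(message_id)
        return True


consumer = IdempotentConsumer()
results = [
    consumer.handle("msg-1", 50),
    consumer.handle("msg-1", 50),  # duplicate delivery of msg-1
    consumer.handle("msg-2", 20),
]
```

The duplicate is acknowledged without repeating the charge, so redelivery changes nothing observable.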
Real Systems That Depend on This
Google Cloud Pub/Sub handles billions of messages per day for services like Gmail, YouTube, and Google Maps.
Apache Kafka is one of the most widely used Pub/Sub systems. It originated at LinkedIn and is also used by Netflix and Uber for real-time event streaming.
AWS SNS + SQS is a common pattern: SNS (Pub/Sub) fans out to SQS queues (per subscriber), providing reliable async processing.
Where This Shows Up in Interviews
- What is Pub/Sub and how does it differ from message queues?
- When would you use Pub/Sub vs direct API calls?
- How do you ensure messages are not lost?
- What is the difference between fan-out and fan-in?
Tradeoffs
- Decoupling vs Debugging: Pub/Sub makes it hard to trace a request end-to-end.
- At-least-once vs Exactly-once: At-least-once is simpler; exactly-once requires idempotency or deduplication.
- Ordering vs Throughput: Strict ordering limits parallelism. Most systems order within a partition.
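The ordering tradeoff can be made concrete with key-based partitioning: messages with the same key always land in the same partition, so their relative order is kept, while different keys can be processed in parallel. The sketch below uses CRC32 as a stand-in hash; that choice and the function names are assumptions for this example, not how any particular broker hashes keys.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Event = Tuple[str, str]  # (key, payload)


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (CRC32 is an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(events: Iterable[Event], num_partitions: int) -> Dict[int, List[Event]]:
    """Append each event to its key's partition, preserving arrival order."""
    partitions: Dict[int, List[Event]] = defaultdict(list)
    for key, payload in events:
        partitions[partition_for(key, num_partitions)].append((key, payload))
    return partitions


events = [
    ("order-1", "placed"),
    ("order-2", "placed"),
    ("order-1", "paid"),
    ("order-2", "shipped"),
    ("order-1", "shipped"),
]
partitions = route(events, num_partitions=4)

# All events for one key sit in one partition, in the order they were sent.
order_1_events = [
    payload
    for key, payload in partitions[partition_for("order-1", 4)]
    if key == "order-1"
]
```

There is no ordering guarantee across partitions, which is exactly what lets them be consumed in parallel.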
Watch Out For
- Not making consumers idempotent — duplicate messages cause duplicate processing
- Not monitoring consumer lag — a slow consumer falls behind and never catches up
- Using Pub/Sub for synchronous request-response — use APIs instead
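Consumer lag is usually tracked as the gap between the newest message position and the position the consumer has committed. A rough sketch of that calculation and a simple "is it getting worse" check follows. The function names and the three-sample window are assumptions for illustration; real systems expose lag through their own metrics.

```python
from typing import Sequence


def consumer_lag(latest_offset: int, committed_offset: int) -> int:
    """Messages published but not yet processed by this consumer."""
    return max(latest_offset - committed_offset, 0)


def is_falling_behind(lag_samples: Sequence[int], min_samples: int = 3) -> bool:
    """True if lag strictly increased across the last `min_samples` samples."""
    recent = list(lag_samples[-min_samples:])
    if len(recent) < min_samples:
        return False  # not enough data to call it a trend
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


current_lag = consumer_lag(latest_offset=1500, committed_offset=1200)
growing = is_falling_behind([100, 250, 400])
recovering = is_falling_behind([400, 250, 100])
```

Alerting on a sustained upward trend, rather than on a single high reading, avoids paging for short bursts the consumer will absorb on its own.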
Go Deeper
- message-queues — start here if this is new to you
- event-driven
- change-data-capture
- sync-vs-async
The Real-World Incident That Made This Famous
Google Cloud had a notable outage in June 2019 that affected Gmail, YouTube, Google Drive, and Snapchat simultaneously. Google's published incident report attributed it to a configuration change that was applied more broadly than intended and reduced network capacity in several regions at once, rather than to a failure specific to Pub/Sub. The lesson still applies here: because so many services relied on the same shared infrastructure, a capacity reduction in that layer cascaded into visible user-facing outages across seemingly unrelated products.
The same double-edged sword exists in pub/sub architectures. The pattern provides excellent decoupling: publishers do not know about subscribers, and adding a new subscriber does not affect existing ones. But it also creates a hidden dependency: the messaging infrastructure itself becomes the most critical piece of the system. Every message that flows through your platform depends on the pub/sub layer being healthy.
Twitter's early architecture is another instructive example. When a user with 30 million followers (say, Barack Obama) tweets, that tweet needs to appear in 30 million timelines. Twitter initially used a fan-out-on-write approach: when a tweet is published, it is immediately pushed to every follower's timeline cache. This is essentially a pub/sub fan-out. For users with millions of followers, a single tweet could generate 30 million cache writes. Twitter later switched to a hybrid approach: fan-out-on-write for normal users (under 10,000 followers) and fan-out-on-read for celebrities (fetch their tweets on demand when the timeline is loaded). This eliminated the massive write amplification problem.
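The hybrid approach described above can be sketched as follows. This is an illustration of the idea only, not Twitter's implementation: the class, the in-memory dictionaries, and the configurable threshold are assumptions, and timelines here are not sorted by time.

```python
from collections import defaultdict
from typing import Dict, List, Set


class TimelineService:
    """Hypothetical hybrid fan-out: push for normal users, pull for celebrities."""

    def __init__(self, followers: Dict[str, Set[str]],
                 celebrity_threshold: int = 10_000) -> None:
        self.followers = followers
        self.celebrity_threshold = celebrity_threshold
        self.timeline_cache: Dict[str, List[str]] = defaultdict(list)
        self.celebrity_tweets: Dict[str, List[str]] = defaultdict(list)

    def post(self, author: str, tweet: str) -> int:
        """Store a tweet; return how many timeline cache writes it caused."""
        audience = self.followers.get(author, set())
        if len(audience) >= self.celebrity_threshold:
            # Fan-out-on-read: one write, followers fetch it later.
            self.celebrity_tweets[author].append(tweet)
            return 0
        # Fan-out-on-write: one cache write per follower.
        for follower in audience:
            self.timeline_cache[follower].append(tweet)
        return len(audience)

    def read_timeline(self, user: str, following: List[str]) -> List[str]:
        """Merge the precomputed cache with celebrity tweets fetched on demand."""
        timeline = list(self.timeline_cache.get(user, []))
        for author in following:
            timeline.extend(self.celebrity_tweets.get(author, []))
        return timeline


# A tiny threshold so the celebrity path is easy to see.
service = TimelineService(
    {"alice": {"bob", "carol"}, "star": {"bob", "carol", "dave"}},
    celebrity_threshold=3,
)
alice_writes = service.post("alice", "hi")
star_writes = service.post("star", "hello")
bob_timeline = service.read_timeline("bob", following=["alice", "star"])
```

The cost moves from write time to read time for large accounts: the celebrity post causes no per-follower writes, and each reader pays a small merge instead.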
How Senior Engineers Think About This
The mental model: pub/sub is a broadcast pattern where publishers shout messages into topics, and subscribers who are interested in a topic receive those messages. The key difference from a message queue is that a pub/sub message is delivered to ALL subscribers (fan-out), while a queue message is consumed by ONE consumer (competing consumers).
Senior engineers think carefully about fan-out ratios. If one published message triggers 1,000 subscriber notifications, and you publish 10,000 messages per second, your system needs to handle 10 million message deliveries per second downstream. This multiplicative effect catches many teams by surprise — they design the publisher for 10K/s throughput and forget that subscriber-side throughput is 1,000x higher.
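The arithmetic in that paragraph is worth writing down as a helper so it gets run during capacity planning. The sketch below reproduces the numbers from the text; the per-instance capacity of 50,000 deliveries per second is a made-up figure used only to show the sizing step.

```python
import math


def downstream_deliveries_per_second(publish_rate: int, fan_out: int) -> int:
    """Each published message is delivered once per subscriber."""
    return publish_rate * fan_out


def instances_needed(deliveries_per_second: int, per_instance_capacity: int) -> int:
    """Round up: a partially loaded instance is still an instance."""
    return math.ceil(deliveries_per_second / per_instance_capacity)


# Numbers from the text: 10,000 messages/s, 1,000 subscribers each.
deliveries = downstream_deliveries_per_second(publish_rate=10_000, fan_out=1_000)

# Hypothetical capacity figure, for illustration only.
subscriber_instances = instances_needed(deliveries, per_instance_capacity=50_000)
```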
The three delivery patterns to know are: at-most-once (fire and forget — messages may be lost if a subscriber is down), at-least-once (guaranteed delivery with possible duplicates — the standard choice), and exactly-once (guaranteed delivery without duplicates — extremely hard to implement correctly). Google Pub/Sub and AWS SNS provide at-least-once by default.
One critical decision is push vs. pull delivery. Push delivery (the pub/sub system sends messages to subscriber endpoints) is simpler but creates backpressure problems — if the subscriber is slow, messages queue up. Pull delivery (subscribers poll for new messages) gives subscribers control over their consumption rate but adds latency. Google Cloud Pub/Sub supports both modes. For real-time needs, use push. For batch processing, use pull.
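To make the pull side concrete, here is a minimal sketch of a subscription that hands out messages only when asked, in batches no larger than the subscriber requests. The class and method names are hypothetical and do not correspond to any client library's API.

```python
from collections import deque
from typing import Deque, List


class PullSubscription:
    """Hypothetical pull subscription: the subscriber sets the pace."""

    def __init__(self) -> None:
        self._backlog: Deque[int] = deque()

    def enqueue(self, message: int) -> None:
        # Called by the broker side when a message is published.
        self._backlog.append(message)

    def pull(self, max_messages: int) -> List[int]:
        """Return up to `max_messages` of the oldest messages in the backlog."""
        batch: List[int] = []
        while self._backlog and len(batch) < max_messages:
            batch.append(self._backlog.popleft())
        return batch

    def backlog_size(self) -> int:
        return len(self._backlog)


subscription = PullSubscription()
for message in range(10):
    subscription.enqueue(message)

first_batch = subscription.pull(max_messages=4)
remaining = subscription.backlog_size()
```

A slow subscriber simply pulls less often or in smaller batches; the backlog grows in the broker instead of overwhelming the subscriber, which is the control that push delivery lacks.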
Common Interview Mistakes
Mistake 1: Confusing pub/sub with message queues. In pub/sub, every subscriber gets every message. In a queue, each message goes to one consumer. Know when to use which.
Mistake 2: Not discussing ordering guarantees. Most pub/sub systems do NOT guarantee global message ordering. If order matters, you need to use ordering keys (Google Cloud Pub/Sub) or partition-level ordering (Kafka).
Mistake 3: Ignoring subscriber failure. What happens when a subscriber is down? Messages should be retained and redelivered when the subscriber recovers. Discuss acknowledgment deadlines and retry policies.
Mistake 4: Not estimating fan-out costs. If you have 1 million subscribers to a topic and publish 100 messages per second, that is 100 million deliveries per second. Always do the fan-out math.
Mistake 5: Forgetting about dead letter handling. Messages that fail delivery after N retries need to go somewhere. Always implement dead letter topics.
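Dead letter handling from Mistake 5 can be sketched as a retry loop that parks a message after a fixed number of failed attempts. The function name, the record layout, and the immediate retries without backoff are all simplifying assumptions; managed brokers typically provide dead letter routing as configuration rather than application code.

```python
from typing import Callable, Dict, List, Tuple


def process_with_dead_letter(
    messages: List[str],
    handler: Callable[[str], None],
    max_attempts: int = 3,
) -> Tuple[List[str], List[Dict[str, object]]]:
    """Try each message up to `max_attempts` times, then dead-letter it."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    processed: List[str] = []
    dead_letters: List[Dict[str, object]] = []
    for message in messages:
        last_error = None
        for _attempt in range(max_attempts):
            try:
                handler(message)
                processed.append(message)
                break  # success: stop retrying this message
            except Exception as error:  # broad on purpose for the sketch
                last_error = error
        else:
            # The loop ended without a break: every attempt failed.
            dead_letters.append(
                {"message": message, "error": repr(last_error),
                 "attempts": max_attempts}
            )
    return processed, dead_letters


def handler(message: str) -> None:
    if message == "bad":
        raise ValueError("cannot parse message")


processed, dead_letters = process_with_dead_letter(["ok-1", "bad", "ok-2"], handler)
```

The poison message no longer blocks the ones behind it, and the dead letter record keeps enough context to inspect or replay it later.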
Production Checklist
- Set appropriate acknowledgment deadlines: long enough for processing, short enough for timely redelivery on failure
- Implement dead letter topics for messages that fail processing after N retries
- Monitor subscription backlog: growing backlog means subscribers are not keeping up with publishers
- Use message filtering at the subscription level to avoid delivering irrelevant messages to subscribers
- Implement idempotent message processing since at-least-once delivery means duplicates are possible
- Set message retention appropriate to your recovery needs: 7 days is a common default
- Monitor fan-out amplification and plan subscriber capacity accordingly
- Use push delivery for real-time needs and pull delivery for batch processing
- Implement backpressure mechanisms: rate limit publishers if subscribers consistently fall behind
- Test subscriber crash recovery: ensure messages are redelivered after a subscriber restarts
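The acknowledgment deadline and crash recovery items can be exercised with a small simulation. The sketch below is a toy model, not a client library: the class is hypothetical, and time is passed in explicitly as `now` so the behavior is deterministic in a test.

```python
from typing import Dict, List, Tuple


class AckDeadlineSubscription:
    """Toy model: unacknowledged messages are redelivered after a deadline."""

    def __init__(self, ack_deadline_seconds: int) -> None:
        self.ack_deadline = ack_deadline_seconds
        self._pending: Dict[str, str] = {}                # ready for delivery
        self._in_flight: Dict[str, Tuple[str, int]] = {}  # id -> (payload, deadline)

    def publish(self, message_id: str, payload: str) -> None:
        self._pending[message_id] = payload

    def pull(self, now: int) -> List[Tuple[str, str]]:
        """Deliver pending messages, including ones whose deadline has expired."""
        for message_id, (payload, deadline) in list(self._in_flight.items()):
            if now >= deadline:
                # No ack arrived in time: make the message deliverable again.
                del self._in_flight[message_id]
                self._pending[message_id] = payload
        delivered = []
        for message_id, payload in list(self._pending.items()):
            del self._pending[message_id]
            self._in_flight[message_id] = (payload, now + self.ack_deadline)
            delivered.append((message_id, payload))
        return delivered

    def ack(self, message_id: str) -> None:
        self._in_flight.pop(message_id, None)


subscription = AckDeadlineSubscription(ack_deadline_seconds=10)
subscription.publish("m1", "charge card")

first = subscription.pull(now=0)        # delivered; subscriber then "crashes"
too_early = subscription.pull(now=5)    # still within the deadline
redelivered = subscription.pull(now=10) # deadline passed without an ack
subscription.ack("m1")
after_ack = subscription.pull(now=30)   # acknowledged, so nothing comes back
```

A deadline that is too short causes redelivery while the first attempt is still running, which is one more reason the consumer has to be idempotent.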
Practical Implementation for .NET Developers
In a .NET application, you would typically implement this pattern using the following approach:
ASP.NET Core setup: Create a service class that encapsulates the logic, register it with dependency injection, and inject it into your controllers or minimal API endpoints. The built-in DI container handles lifecycle management.
Entity Framework Core: For database interactions, EF Core provides the ORM layer. Use migrations for schema management and raw SQL for performance-critical queries. Consider Dapper for read-heavy paths where EF Core's overhead matters.
Azure integration: If deploying to Azure, leverage managed services: Azure Service Bus topics and subscriptions for the pub/sub layer itself, plus Azure Cache for Redis, Azure SQL, and Azure Cosmos DB for supporting state. These reduce operational overhead and provide built-in monitoring through Application Insights.
Testing: Use xUnit with Testcontainers for integration tests that spin up real databases in Docker. Mock external dependencies with NSubstitute. The WebApplicationFactory class lets you test your entire HTTP pipeline in-process.
Monitoring: Add Application Insights telemetry to track request latency, dependency calls, and custom metrics. Use structured logging with Serilog to make production debugging possible:
Log.Information("Processing order {OrderId} for {CustomerId}", orderId, customerId);
This gives you searchable, structured logs in Azure Monitor or Seq.