Intermediate | 10 min read | Updated 2026-06-08

Publish-Subscribe Pattern


Pub/Sub (publish-subscribe) is a messaging pattern where publishers send messages to named topics and subscribers receive messages from topics they care about. Publishers and subscribers are fully decoupled — neither knows about the other. It is the backbone of event-driven architectures, powering everything from order processing pipelines to real-time notification systems at Google, LinkedIn, and Uber.

Aspect	Details
What it is	Messaging pattern where publishers write to topics and all subscribers receive a copy
When to use	Event-driven architectures, fan-out (one event triggers many consumers), decoupled microservices
When NOT to use	Request-response workflows needing immediate replies; simple point-to-point communication
Real-world example	Google Cloud Pub/Sub powers Gmail and YouTube; LinkedIn uses Kafka for activity feeds with billions of events/day
Interview tip	Contrast with message queues — pub/sub fans out to ALL subscribers, a queue delivers to ONE consumer
Common mistake	Not making consumers idempotent — at-least-once delivery means duplicates are inevitable
Key tradeoff	Decoupling and horizontal scalability vs. debugging difficulty and eventual consistency

Pub/Sub is the foundation of event-driven architectures. It enables microservices to communicate asynchronously, decouples producers from consumers, and handles traffic spikes by buffering messages.

How It Works Under the Hood

Figure: Pub/Sub architecture. The order service publishes to the order-placed topic, and the Kafka broker fans out to the payment service, inventory service, and notification service independently.

Pub/Sub (Publish-Subscribe) is a messaging pattern where publishers send messages to topics without knowing who will receive them, and subscribers receive messages from topics they are interested in. The publisher and subscriber are completely decoupled — they do not need to know about each other.

A publisher sends a message to a topic (e.g., 'order-placed'). The message broker (Kafka, RabbitMQ, Google Pub/Sub) stores the message and delivers it to all subscribers of that topic. Subscribers process messages asynchronously at their own pace.

This decouples the ordering service (publisher) from the payment service, inventory service, and notification service (subscribers). If the notification service is down, messages queue up and are processed when it recovers.
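The flow described above can be sketched with a toy in-memory broker. This is a minimal illustration, not how Kafka, RabbitMQ, or Google Pub/Sub are implemented; the InMemoryBroker class and its subscribe, publish, and poll methods are hypothetical names chosen for the example.

```python
from collections import defaultdict, deque


class InMemoryBroker:
    """Toy broker (hypothetical): each subscriber of a topic has its own queue."""

    def __init__(self):
        # topic name -> {subscriber name -> queue of pending messages}
        self._queues = defaultdict(dict)

    def subscribe(self, topic, subscriber):
        self._queues[topic].setdefault(subscriber, deque())

    def publish(self, topic, message):
        # Fan-out: enqueue the message for every subscriber of the topic.
        for queue in self._queues[topic].values():
            queue.append(message)

    def poll(self, topic, subscriber):
        """Return the next pending message for this subscriber, or None."""
        queue = self._queues[topic][subscriber]
        return queue.popleft() if queue else None


broker = InMemoryBroker()
for service in ("payment", "inventory", "notification"):
    broker.subscribe("order-placed", service)

broker.publish("order-placed", {"order_id": 1})
broker.publish("order-placed", {"order_id": 2})

# Payment and inventory keep up; notification is "down" and polls later.
payment_first = broker.poll("order-placed", "payment")
inventory_first = broker.poll("order-placed", "inventory")
notification_backlog = [broker.poll("order-placed", "notification") for _ in range(2)]
```

Because each subscriber has its own queue, the notification service finds both orders waiting when it comes back, while the publisher never had to know it was down.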

The Mental Model

Figure: Message flow. The publisher sends an event to a topic, the broker persists the message, each subscriber receives a copy asynchronously, and subscribers acknowledge after processing.

Topics: Named channels for messages. Publishers write to topics; subscribers read from topics.
Decoupling: Publishers do not know about subscribers, and vice versa. You can add new subscribers without changing publishers.
Fan-out: One message can be delivered to many subscribers simultaneously.
At-least-once delivery: Most Pub/Sub systems guarantee each message is delivered at least once. Consumers must be idempotent.
Ordering: Some systems guarantee ordering within a partition (Kafka). Others, such as standard SNS topics, make no ordering guarantee.
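The idempotency requirement in the list above can be illustrated with a small deduplicating consumer. This is a sketch under assumptions: the IdempotentConsumer class is hypothetical, every message is assumed to carry a unique ID, and the set of seen IDs is kept in memory, whereas a real consumer would need a durable store.

```python
class IdempotentConsumer:
    """Hypothetical wrapper that skips message IDs it has already processed."""

    def __init__(self, handler):
        self._handler = handler
        self._seen_ids = set()  # assumption: in production this is a durable store

    def on_message(self, message_id, payload):
        """Process a message once; return False if it was a duplicate."""
        if message_id in self._seen_ids:
            return False
        self._handler(payload)
        # Record the ID only after the handler succeeds, so a failed
        # attempt can still be retried on redelivery.
        self._seen_ids.add(message_id)
        return True


charges = []
consumer = IdempotentConsumer(handler=charges.append)

# The broker redelivers m-1, for example after a missed acknowledgment.
results = [
    consumer.on_message("m-1", {"order_id": 1, "amount": 50}),
    consumer.on_message("m-1", {"order_id": 1, "amount": 50}),
    consumer.on_message("m-2", {"order_id": 2, "amount": 75}),
]
```

The duplicate delivery of m-1 is acknowledged but not processed again, so the card is charged once per order.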

Real Systems That Depend on This

Google Cloud Pub/Sub handles billions of messages per day for services like Gmail, YouTube, and Google Maps.

Apache Kafka is one of the most widely used Pub/Sub systems, adopted by LinkedIn, Netflix, and Uber for real-time event streaming.

AWS SNS + SQS is a common pattern: SNS (Pub/Sub) fans out to SQS queues (per subscriber), providing reliable async processing.

Where This Shows Up in Interviews

What is Pub/Sub and how does it differ from message queues?
When would you use Pub/Sub vs direct API calls?
How do you ensure messages are not lost?
What is the difference between fan-out and fan-in?

Tradeoffs

Figure: Data flow. An event is published to a topic, the broker writes it to a partition log, consumer group A reads at its own offset, and consumer group B reads independently at its own offset.

Decoupling vs Debugging: Pub/Sub makes it hard to trace a request end-to-end.
At-least-once vs Exactly-once: At-least-once is simpler; exactly-once requires idempotency or deduplication.
Ordering vs Throughput: Strict ordering limits parallelism. Most systems order within a partition.
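The ordering tradeoff can be made concrete with key-based partitioning: events that share a key land in the same partition and keep their relative order, while different keys can be processed in parallel. The sketch below is an assumption-laden illustration. It uses CRC32 as a stand-in hash (Kafka's default partitioner uses a different hash function), and NUM_PARTITIONS, partition_for, and events_for are hypothetical names.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# partition index -> append-only list standing in for the partition log
partitions = defaultdict(list)
events = [
    ("order-1", "placed"), ("order-2", "placed"),
    ("order-1", "paid"), ("order-2", "cancelled"), ("order-1", "shipped"),
]
for key, event in events:
    partitions[partition_for(key)].append((key, event))


def events_for(key):
    """Read one key's events back from its partition, in log order."""
    return [e for k, e in partitions[partition_for(key)] if k == key]
```

Ordering holds per key only. Two different orders may interleave in any way across partitions, which is the price paid for parallel consumption.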

Watch Out For

Not making consumers idempotent — duplicate messages cause duplicate processing
Not monitoring consumer lag — a slow consumer falls behind and never catches up
Using Pub/Sub for synchronous request-response — use APIs instead
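Consumer lag is simple to compute once you have two numbers per partition: the newest offset written and the last offset the consumer committed. The following sketch assumes those offsets are already available as plain dictionaries; consumer_lag and is_falling_behind are hypothetical helpers, and the alerting rule is one possible heuristic rather than a standard.

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Per-partition lag: messages written but not yet committed by the consumer."""
    return {
        partition: end - committed_offsets.get(partition, 0)
        for partition, end in log_end_offsets.items()
    }


def is_falling_behind(lag_samples, threshold):
    """True if lag grew across every consecutive sample and the latest exceeds threshold."""
    growing = all(b > a for a, b in zip(lag_samples, lag_samples[1:]))
    return growing and lag_samples[-1] > threshold


# Illustrative offsets for a two-partition topic.
lag = consumer_lag({0: 1200, 1: 900}, {0: 1150, 1: 400})
total_lag = sum(lag.values())
```

A single high reading is often just a burst; lag that keeps growing across samples is the signal that the consumer will not catch up without more capacity.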

How to Explain This in an Interview

Here is how I would explain the Publish-Subscribe pattern in a system design interview:

Pub/Sub decouples producers from consumers through topics. A publisher writes an event to a topic — say 'order-placed'. Every service subscribed to that topic gets a copy: payment service charges the card, inventory service reserves stock, notification service sends a confirmation email. None of these services know about each other. The key difference from a message queue that interviewers test: in a queue, each message goes to ONE consumer (competing consumers pattern). In pub/sub, each message fans out to ALL subscribers. The critical implementation detail: consumers must be idempotent because most systems guarantee at-least-once delivery, meaning a consumer may process the same message twice during retries or rebalances.

Go Deeper

Message Queues — start here if this is new to you
Event-Driven vs Request-Driven
Change Data Capture
Sync vs Async

The Real-World Incident That Made This Famous

Google Cloud had a notable outage in June 2019 that affected Gmail, YouTube, Google Drive, and Snapchat simultaneously. Google's public incident report traced it to a configuration change that was applied more broadly than intended and cut network capacity in several regions at once, so the trigger was in the networking layer rather than in Pub/Sub itself. The failure mode is the one that matters here: when many services rely on one shared layer for communication, a capacity reduction in that layer cascades into visible user-facing outages across seemingly unrelated products.

The same double-edged sword applies to pub/sub architectures. The pattern provides excellent decoupling: publishers do not know about subscribers, and adding a new subscriber does not affect existing ones. But it also creates a hidden dependency, because the messaging infrastructure itself becomes the most critical piece of the system. Every message that flows through your platform depends on the pub/sub layer being healthy.

Twitter's early architecture is another instructive example. When a user with 30 million followers (say, Barack Obama) tweets, that tweet needs to appear in 30 million timelines. Twitter initially used a fan-out-on-write approach: when a tweet is published, it is immediately pushed to every follower's timeline cache. This is essentially a pub/sub fan-out. For users with millions of followers, a single tweet could generate 30 million cache writes. Twitter later switched to a hybrid approach: fan-out-on-write for normal users (under 10,000 followers) and fan-out-on-read for celebrities (fetch their tweets on demand when the timeline is loaded). This eliminated the massive write amplification problem.
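The hybrid approach can be sketched in a few lines. This is an illustration of the idea only, not Twitter's implementation: the data structures, post_tweet, and read_timeline are hypothetical, the 10,000-follower cutoff is the figure quoted above, and time-based sorting of the merged timeline is omitted.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 10_000  # follower cutoff quoted in the text

followers = defaultdict(set)          # author -> set of followers
timeline_cache = defaultdict(list)    # user -> tweets pushed at write time
celebrity_tweets = defaultdict(list)  # author -> tweets fetched at read time


def post_tweet(author, text, threshold=CELEBRITY_THRESHOLD):
    if len(followers[author]) < threshold:
        # Fan-out-on-write: one cache write per follower.
        for follower in followers[author]:
            timeline_cache[follower].append((author, text))
    else:
        # Fan-out-on-read: a single write, merged in when timelines load.
        celebrity_tweets[author].append((author, text))


def read_timeline(user, following):
    """Combine pre-computed entries with tweets pulled from followed celebrities."""
    pulled = [t for author in following for t in celebrity_tweets[author]]
    return timeline_cache[user] + pulled


# Demo with a tiny threshold so "star" counts as a celebrity.
followers["alice"] = {"carol"}
followers["star"] = {"carol", "dave", "erin"}
post_tweet("alice", "hi", threshold=3)
post_tweet("star", "hello", threshold=3)
carol_timeline = read_timeline("carol", following=["alice", "star"])
```

The celebrity's tweet costs one write regardless of follower count; the cost moves to read time, where each timeline load does a small merge.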

How Senior Engineers Think About This

The mental model: pub/sub is a broadcast pattern where publishers shout messages into topics, and subscribers who are interested in a topic receive those messages. The key difference from a message queue is that a pub/sub message is delivered to ALL subscribers (fan-out), while a queue message is consumed by ONE consumer (competing consumers).

Senior engineers think carefully about fan-out ratios. If one published message triggers 1,000 subscriber notifications, and you publish 10,000 messages per second, your system needs to handle 10 million message deliveries per second downstream. This multiplicative effect catches many teams by surprise — they design the publisher for 10K/s throughput and forget that subscriber-side throughput is 1,000x higher.
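The fan-out arithmetic is worth writing down explicitly. The sketch below reproduces the numbers from the paragraph above; the per-instance capacity of 50,000 deliveries per second is a made-up figure used only to show how the sizing follows.

```python
import math


def downstream_deliveries_per_second(publish_rate, subscribers_per_message):
    """Total deliveries the subscriber side must absorb each second."""
    return publish_rate * subscribers_per_message


def instances_needed(deliveries_per_second, per_instance_capacity):
    """Round up: a partially loaded instance is still a whole instance."""
    return math.ceil(deliveries_per_second / per_instance_capacity)


deliveries = downstream_deliveries_per_second(
    publish_rate=10_000, subscribers_per_message=1_000
)
# Hypothetical capacity: each delivery worker handles 50,000 deliveries/s.
workers = instances_needed(deliveries, per_instance_capacity=50_000)
```

With these inputs the publisher side sees 10,000 messages per second while the delivery side sees 10,000,000, which is why capacity planning has to start from the subscriber count rather than the publish rate.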

The three delivery patterns to know are: at-most-once (fire and forget — messages may be lost if a subscriber is down), at-least-once (guaranteed delivery with possible duplicates — the standard choice), and exactly-once (guaranteed delivery without duplicates — extremely hard to implement correctly). Google Pub/Sub and AWS SNS provide at-least-once by default.

One critical decision is push vs. pull delivery. Push delivery (the pub/sub system sends messages to subscriber endpoints) is simpler but creates backpressure problems — if the subscriber is slow, messages queue up. Pull delivery (subscribers poll for new messages) gives subscribers control over their consumption rate but adds latency. Google Cloud Pub/Sub supports both modes. For real-time needs, use push. For batch processing, use pull.
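The control that pull delivery gives a subscriber can be shown with a bounded batch pull. This is a toy model, not the Google Cloud Pub/Sub client API; PullSubscription and its methods are hypothetical names.

```python
from collections import deque


class PullSubscription:
    """Hypothetical subscription whose backlog is drained only when asked."""

    def __init__(self):
        self._backlog = deque()

    def enqueue(self, message):
        # Called by the broker side when a message is published.
        self._backlog.append(message)

    def pull(self, max_messages):
        """Return up to max_messages; the subscriber decides how much to take."""
        batch = []
        while self._backlog and len(batch) < max_messages:
            batch.append(self._backlog.popleft())
        return batch

    def backlog_size(self):
        return len(self._backlog)


subscription = PullSubscription()
for i in range(10):
    subscription.enqueue(f"event-{i}")

# The subscriber takes only what it can process right now.
first_batch = subscription.pull(max_messages=4)
remaining = subscription.backlog_size()
```

The unprocessed messages stay in the broker's backlog instead of piling up inside a slow subscriber, which is the backpressure behavior push delivery has to add separately.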

Common Interview Mistakes

Mistake 1: Confusing pub/sub with message queues. In pub/sub, every subscriber gets every message. In a queue, each message goes to one consumer. Know when to use which.

Mistake 2: Not discussing ordering guarantees. Most pub/sub systems do NOT guarantee global message ordering. If order matters, you need to use ordering keys (Google Cloud Pub/Sub) or partition-level ordering (Kafka).

Mistake 3: Ignoring subscriber failure. What happens when a subscriber is down? Messages should be retained and redelivered when the subscriber recovers. Discuss acknowledgment deadlines and retry policies.

Mistake 4: Not estimating fan-out costs. If you have 1 million subscribers to a topic and publish 100 messages per second, that is 100 million deliveries per second. Always do the fan-out math.

Mistake 5: Forgetting about dead letter handling. Messages that fail delivery after N retries need to go somewhere. Always implement dead letter topics.
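Retry limits and dead letter handling from Mistakes 3 and 5 fit together in one small loop. The sketch below is a simplified model under assumptions: redelivery happens immediately with no backoff, a raised exception stands in for a missed acknowledgment, and MAX_DELIVERY_ATTEMPTS, deliver_with_retries, and handle_order are hypothetical names.

```python
MAX_DELIVERY_ATTEMPTS = 3  # hypothetical retry limit


def deliver_with_retries(messages, handler, max_attempts=MAX_DELIVERY_ATTEMPTS):
    """Call handler per message; dead-letter the message after max_attempts failures."""
    processed, dead_letters = [], []
    for message in messages:
        for attempt in range(1, max_attempts + 1):
            try:
                handler(message)
                processed.append(message)  # success counts as an acknowledgment
                break
            except Exception as error:
                if attempt == max_attempts:
                    # Keep the failure context so the message can be inspected later.
                    dead_letters.append(
                        {"message": message, "error": str(error), "attempts": attempt}
                    )
    return processed, dead_letters


def handle_order(message):
    # A malformed message fails the same way on every attempt.
    if message.get("amount") is None:
        raise ValueError("missing amount")


processed, dead_letters = deliver_with_retries(
    [{"order_id": 1, "amount": 50}, {"order_id": 2}], handle_order
)
```

Without the dead letter list, the malformed message would either be retried forever and block the subscription or be dropped silently; parking it keeps the rest of the stream moving.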

Production Checklist

Set appropriate acknowledgment deadlines: long enough for processing, short enough for timely redelivery on failure
Implement dead letter topics for messages that fail processing after N retries
Monitor subscription backlog: growing backlog means subscribers are not keeping up with publishers
Use message filtering at the subscription level to avoid delivering irrelevant messages to subscribers
Implement idempotent message processing since at-least-once delivery means duplicates are possible
Set message retention appropriate to your recovery needs: 7 days is a common default
Monitor fan-out amplification and plan subscriber capacity accordingly
Use push delivery for real-time needs and pull delivery for batch processing
Implement backpressure mechanisms: rate limit publishers if subscribers consistently fall behind
Test subscriber crash recovery: ensure messages are redelivered after a subscriber restarts

Read the original source | Content from System-Design-Overview

Practical Implementation for .NET Developers

In a .NET application, you would typically implement this pattern using the following approach:

ASP.NET Core setup: Create a service class that encapsulates the logic, register it with dependency injection, and inject it into your controllers or minimal API endpoints. The built-in DI container handles lifecycle management.

Entity Framework Core: For database interactions, EF Core provides the ORM layer. Use migrations for schema management and raw SQL for performance-critical queries. Consider Dapper for read-heavy paths where EF Core's overhead matters.

Azure integration: If deploying to Azure, leverage managed services — Azure Cache for Redis, Azure SQL, Azure Service Bus, Azure Cosmos DB. These eliminate operational overhead and provide built-in monitoring through Application Insights.

Testing: Use xUnit with Testcontainers for integration tests that spin up real databases in Docker. Mock external dependencies with NSubstitute. The WebApplicationFactory class lets you test your entire HTTP pipeline in-process.

Monitoring: Add Application Insights telemetry to track request latency, dependency calls, and custom metrics. Use structured logging with Serilog to make production debugging possible:

csharp

Log.Information("Processing order {OrderId} for {CustomerId}", orderId, customerId);

This gives you searchable, structured logs in Azure Monitor or Seq.


The Problem Publish-Subscribe Pattern Solves