Publish/Subscribe Pattern (Deep Dive)
Beyond the basics: topic design, subscription filtering, at-least-once vs exactly-once delivery, and scaling pub/sub systems.
What Is Pub/Sub?
The Publish/Subscribe (pub/sub) pattern decouples message producers from consumers through an intermediary called a topic or channel. Publishers emit events without knowing who will receive them. Subscribers declare interest in specific topics and receive all matching messages. This is the foundational pattern behind systems like Google Cloud Pub/Sub, AWS SNS, Apache Kafka, and Redis Pub/Sub.
The core insight is that neither side needs to know the other exists. This enables you to add new consumers (log an audit trail, send a notification, update a cache) without modifying the publisher at all — the classic Open/Closed Principle applied to distributed systems.
Core Pub/Sub Flow
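The flow can be sketched with a small in-memory broker. This is only an illustration: the `Broker` class and its `subscribe`/`publish` methods are hypothetical names for this example, and a real broker adds networking, persistence, and acknowledgements.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class Broker:
    """Hypothetical in-memory broker: maps topic names to subscriber callbacks."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # A subscriber declares interest in a topic; it never sees the publisher.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver the message to every subscriber of the topic; return how many received it."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = Broker()
audit_log: List[dict] = []
notifications: List[str] = []

# Two independent consumers are attached without touching the publisher code.
broker.subscribe("orders.placed", audit_log.append)
broker.subscribe("orders.placed", lambda m: notifications.append(f"order {m['order_id']} placed"))

delivered = broker.publish("orders.placed", {"order_id": 42})
```

Adding a third consumer would be one more `subscribe` call; the `publish` call site stays unchanged.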
Topic Design
Topic granularity is one of the most consequential design decisions in a pub/sub system. Topics that are too broad force consumers to filter out noise; topics that are too narrow multiply the number of topics and schemas to maintain and add operational overhead.
| Granularity | Example Topic | Pros | Cons |
|---|---|---|---|
| Coarse (entity-level) | `orders` | Simple publisher API, fewer topics to manage | Consumers must filter by event type themselves |
| Medium (event-level) | `orders.placed`, `orders.shipped` | Consumers only receive what they need | More topics; publisher must route correctly |
| Fine (instance-level) | `orders.placed.us-east-1` | Extreme selectivity | Topic explosion; harder to manage |
Recommended Naming Convention
Use dot-separated hierarchical names: `{domain}.{entity}.{event}`. For example: `commerce.order.placed`, `commerce.order.shipped`, `payments.invoice.created`. This enables prefix-based wildcard subscriptions in brokers that support them (e.g., NATS, RabbitMQ topic exchanges).
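One way to keep names consistent is to build them through a single helper that rejects anything off-convention. The sketch below assumes a rule of exactly three lowercase segments; the `topic_name` function and the allowed character set are choices made for this example, not a broker requirement.

```python
import re

# Assumed rule for this sketch: a segment starts with a lowercase letter and
# may contain lowercase letters, digits, underscores, and hyphens.
_SEGMENT = r"[a-z][a-z0-9_-]*"
_TOPIC_RE = re.compile(rf"{_SEGMENT}\.{_SEGMENT}\.{_SEGMENT}")


def topic_name(domain: str, entity: str, event: str) -> str:
    """Build a `{domain}.{entity}.{event}` topic name, rejecting malformed segments."""
    name = f"{domain}.{entity}.{event}"
    if not _TOPIC_RE.fullmatch(name):
        raise ValueError(f"invalid topic name: {name!r}")
    return name


order_placed = topic_name("commerce", "order", "placed")
invoice_created = topic_name("payments", "invoice", "created")
```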
Subscription Filtering
Modern brokers support server-side filtering so consumers only receive messages matching their criteria. This is far more efficient than pulling all messages and discarding most of them client-side.
- Topic-based filtering: subscriber subscribes to a specific topic string (e.g., `orders.placed`)
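- Content-based filtering: subscriber provides a predicate on message attributes (e.g., `region = 'us-east-1' AND amount > 1000`). AWS SNS (subscription filter policies) and Google Cloud Pub/Sub (subscription filters on message attributes) each support a form of this, though the syntax and the available operators differ by broker.
- Wildcard subscriptions: NATS supports `orders.*` (exactly one token) and `orders.>` (one or more trailing tokens). RabbitMQ topic exchanges use `*` for exactly one word and `#` for zero or more words.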
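The filtering styles above can be sketched in a few lines. The `matches` function below is a simplified re-implementation of NATS-style token matching written for illustration (it is not the NATS client API), and `high_value_us_east` is a hypothetical content-based predicate evaluated in plain Python rather than in a broker's filter language.

```python
def matches(pattern: str, subject: str) -> bool:
    """NATS-style matching: '*' matches exactly one token, '>' matches one or more trailing tokens."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' must be the last pattern token and needs at least one subject token left.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False  # subject ran out of tokens before the pattern did
        if p != "*" and p != s_tokens[i]:
            return False  # literal token mismatch
    return len(p_tokens) == len(s_tokens)


def high_value_us_east(attrs: dict) -> bool:
    """Hypothetical content-based filter over message attributes."""
    return attrs.get("region") == "us-east-1" and attrs.get("amount", 0) > 1000


one_level = matches("orders.*", "orders.placed")
too_deep = matches("orders.*", "orders.placed.us-east-1")
all_levels = matches("orders.>", "orders.placed.us-east-1")
```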
Delivery Guarantees
Understanding delivery semantics is critical in interviews. There are three levels, and most real-world systems default to at-least-once:
| Guarantee | Description | Risk | Used In |
|---|---|---|---|
| At-most-once | Fire and forget; message may be lost | Data loss on failure | Metrics, real-time telemetry where occasional loss is acceptable |
| At-least-once | Message delivered at least once; duplicates possible | Duplicate processing unless consumers are idempotent | Most messaging systems (Kafka default, SQS standard queues, SNS, RabbitMQ with acknowledgements) |
| Exactly-once | Delivered precisely once; no duplicates, no loss | High complexity and cost | Kafka with transactions + idempotent producers (EOS) |
Exactly-Once Is Expensive
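Exactly-once semantics typically rely on transactions, an atomic commit protocol, or broker-side deduplication, all of which add latency and operational complexity. In practice, most architects design consumers to be idempotent and accept at-least-once delivery: this is simpler, usually faster, and produces the same end result as long as reprocessing a duplicate has no additional effect.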
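A minimal sketch of an idempotent consumer, assuming each message carries a unique `message_id` (a field name chosen for this example). The in-memory set stands in for a durable store; in a real system the deduplication record and the side effect would need to be committed atomically, for instance in the same database transaction.

```python
from typing import Dict, Set


class IdempotentConsumer:
    """Hypothetical consumer that applies each message at most once, keyed by message_id."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()          # stand-in for a durable dedup table
        self.balances: Dict[str, int] = {}    # the state the messages mutate

    def handle(self, message: dict) -> bool:
        """Apply the message; return False if it was a duplicate and was skipped."""
        msg_id = message["message_id"]
        if msg_id in self._seen:
            return False
        account = message["account"]
        self.balances[account] = self.balances.get(account, 0) + message["amount"]
        self._seen.add(msg_id)
        return True


consumer = IdempotentConsumer()
payment = {"message_id": "m-1", "account": "acct-7", "amount": 50}

first = consumer.handle(payment)
redelivered = consumer.handle(payment)  # the broker redelivers the same message
```

After the redelivery the balance for `acct-7` is still 50, which is the property that makes at-least-once delivery safe here.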
Push vs Pull Delivery
Brokers either push messages to subscribers or allow subscribers to pull at their own pace. Each model has distinct trade-offs:
| Model | How It Works | Back-Pressure | Example |
|---|---|---|---|
| Push | Broker delivers immediately when a message arrives | Subscriber can be overwhelmed; needs flow control | SNS HTTP subscriptions, WebSocket push |
| Pull | Subscriber polls the broker on its own schedule | Natural back-pressure; consumer controls rate | Kafka consumers, SQS polling, Google Cloud Pub/Sub pull |
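The back-pressure property of the pull model can be illustrated with a toy queue. `PullQueue` and its `pull(max_messages)` method are hypothetical names for this sketch; they mirror the general shape of batch-polling APIs without reproducing any specific client library.

```python
from collections import deque
from typing import Deque, List


class PullQueue:
    """Hypothetical pull-based subscription: messages wait until the consumer asks for them."""

    def __init__(self) -> None:
        self._messages: Deque[str] = deque()

    def publish(self, message: str) -> None:
        self._messages.append(message)

    def pull(self, max_messages: int) -> List[str]:
        """Return up to max_messages in arrival order; the consumer chooses the batch size."""
        batch: List[str] = []
        while self._messages and len(batch) < max_messages:
            batch.append(self._messages.popleft())
        return batch

    def backlog(self) -> int:
        return len(self._messages)


queue = PullQueue()
for i in range(5):
    queue.publish(f"event-{i}")

# A slow consumer takes only what it can process; the rest stays buffered in the queue.
batch = queue.pull(max_messages=2)
remaining = queue.backlog()
```

In a push model the broker would have invoked the consumer five times immediately, so the consumer would need its own buffering or flow control.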
Durable vs Ephemeral Subscriptions
A durable subscription retains messages while the subscriber is offline and delivers them when it reconnects. An ephemeral (non-durable) subscription discards messages when the subscriber is absent. Kafka consumer groups behave durably: the broker retains messages for the topic's configured retention period and stores each group's committed offsets, so a restarted consumer resumes where it left off. Redis Pub/Sub is ephemeral: offline subscribers miss messages.
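The difference can be sketched with a toy topic that keeps a backlog only for durable subscriptions. The `Topic` class and its `subscribe`, `disconnect`, and `reconnect` methods are hypothetical and in-memory only; real brokers persist the backlog and bound it with retention limits.

```python
from typing import Callable, Dict, List, Optional

Handler = Callable[[str], None]


class Topic:
    """Hypothetical topic supporting durable and ephemeral subscriptions."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Optional[Handler]] = {}  # None means "durable but offline"
        self._backlogs: Dict[str, List[str]] = {}          # only durable subscriptions have one

    def subscribe(self, name: str, handler: Handler, durable: bool = False) -> None:
        self._handlers[name] = handler
        if durable:
            self._backlogs.setdefault(name, [])

    def disconnect(self, name: str) -> None:
        if name in self._backlogs:
            self._handlers[name] = None     # durable: keep the subscription, buffer messages
        else:
            self._handlers.pop(name, None)  # ephemeral: the subscription disappears

    def reconnect(self, name: str, handler: Handler) -> None:
        self._handlers[name] = handler
        for message in self._backlogs.get(name, []):
            handler(message)                # replay what was missed while offline
        if name in self._backlogs:
            self._backlogs[name] = []

    def publish(self, message: str) -> None:
        for name, handler in self._handlers.items():
            if handler is not None:
                handler(message)
            else:
                self._backlogs[name].append(message)


topic = Topic()
durable_inbox: List[str] = []
ephemeral_inbox: List[str] = []
topic.subscribe("billing", durable_inbox.append, durable=True)
topic.subscribe("dashboard", ephemeral_inbox.append)

topic.publish("m1")
topic.disconnect("billing")
topic.disconnect("dashboard")
topic.publish("m2")  # both subscribers are offline
topic.reconnect("billing", durable_inbox.append)
topic.reconnect("dashboard", ephemeral_inbox.append)
topic.publish("m3")
```

The durable subscriber ends up with all three messages, while the ephemeral one never sees `m2`.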
Scaling Pub/Sub Systems
When a single consumer cannot keep up with a topic's message rate, you can scale out with consumer groups (Kafka) or competing consumers on a shared queue behind a subscription. The key distinction: the consumers in a Kafka consumer group divide the topic's partitions among themselves, and each partition is consumed by exactly one group member at a time. Parallelism is therefore capped by the partition count: adding consumers helps only until there is one consumer per partition, and any beyond that sit idle.
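The partition-to-consumer relationship can be made concrete with a simplified round-robin assignment. This is an illustrative sketch, not Kafka's actual assignor logic, and `assign_partitions` is a hypothetical helper.

```python
from typing import Dict, List


def assign_partitions(num_partitions: int, consumers: List[str]) -> Dict[str, List[int]]:
    """Spread partitions over group members round-robin; each partition gets exactly one owner."""
    assignment: Dict[str, List[int]] = {consumer: [] for consumer in consumers}
    if not consumers:
        return assignment
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Four partitions, two consumers: each consumer owns two partitions.
balanced = assign_partitions(4, ["c1", "c2"])

# Two partitions, three consumers: the third consumer receives nothing and sits idle.
over_provisioned = assign_partitions(2, ["c1", "c2", "c3"])
```

The second call shows why the partition count should be sized for the peak number of consumers expected in a group.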
For SNS/SQS, the standard pattern is: SNS topic fans out to multiple SQS queues, each owned by a different service. Within each service, multiple EC2 or Lambda consumers compete for messages on that SQS queue. This is the canonical AWS fan-out architecture.
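The shape of this architecture can be simulated in memory to show the two behaviours at play: fan-out across queues and competition within a queue. The sketch deliberately avoids the AWS SDK; `FanOutTopic` and the worker names are hypothetical stand-ins for an SNS topic and for consumers polling an SQS queue.

```python
from collections import deque
from typing import Deque, Dict, List


class FanOutTopic:
    """Hypothetical stand-in for a topic that copies every message to each attached queue."""

    def __init__(self) -> None:
        self._queues: List[Deque[str]] = []

    def attach(self, queue: Deque[str]) -> None:
        self._queues.append(queue)

    def publish(self, message: str) -> None:
        for queue in self._queues:
            queue.append(message)  # every service's queue gets its own copy


orders_topic = FanOutTopic()
billing_queue: Deque[str] = deque()
shipping_queue: Deque[str] = deque()
orders_topic.attach(billing_queue)
orders_topic.attach(shipping_queue)

for i in range(4):
    orders_topic.publish(f"order-{i}")

# Competing consumers: two billing workers take turns removing messages from the
# same queue, so each message is processed by only one of them.
workers: Dict[str, List[str]] = {"billing-worker-1": [], "billing-worker-2": []}
names = list(workers)
turn = 0
while billing_queue:
    workers[names[turn % len(names)]].append(billing_queue.popleft())
    turn += 1
```

The shipping queue is untouched by the billing workers, so each service scales its own consumers independently.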
Interview Tip
When asked to design a notification system or an event-driven microservice, lead with pub/sub and immediately cover: (1) topic naming, (2) delivery guarantee (at-least-once + idempotent consumers), (3) how consumers scale (consumer groups / competing consumers), and (4) what happens when a subscriber is down or keeps failing (a durable queue so it can catch up, a DLQ for messages that repeatedly fail processing). This covers the four questions interviewers care most about.