Competing Consumers Pattern
Scale message processing horizontally: multiple consumers on a single queue, message visibility, ordering guarantees, and partition-based parallelism.
What Are Competing Consumers?
The Competing Consumers pattern runs multiple consumer instances against a single queue. Each consumer independently polls for messages, and the queue delivers each message to only one consumer at a time. In practice this is at-least-once delivery — after a failure a message can be redelivered — so consumers must process idempotently. Consumers compete for work: the fastest consumer processes the most messages, making the pool self-load-balancing.
This pattern enables horizontal scaling of message processing: doubling the number of consumers approximately doubles throughput (up to the queue's throughput limit). It's the dominant pattern for task queues, job processors, and async worker pools.
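The self-load-balancing behavior can be illustrated with a minimal in-process sketch — Python's thread-safe `queue.Queue` standing in for the message broker (names like `worker` and `job-N` are illustrative, not from any real system):

```python
import queue
import threading

def worker(name, q, results):
    # Each consumer competes for work: whichever thread polls next gets the message.
    while True:
        try:
            msg = q.get(timeout=0.1)
        except queue.Empty:
            return  # queue drained; this consumer exits
        results.append((name, msg))
        q.task_done()

q = queue.Queue()
for i in range(10):
    q.put(f"job-{i}")

results = []
threads = [threading.Thread(target=worker, args=(f"consumer-{n}", q, results))
           for n in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every job is processed exactly once, by whichever consumer grabbed it first.
print(len(results))  # 10
```

A slow consumer simply pulls fewer messages; no coordinator assigns work.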
Message Visibility and Acknowledgment
When a consumer receives a message, the queue makes it invisible to other consumers for a configurable visibility timeout (e.g., 30 s in Amazon SQS). If the consumer processes and acknowledges the message within that window, it is deleted from the queue. If the consumer crashes or times out before acknowledging, the message becomes visible again and another consumer picks it up.
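A toy model (not a real broker client) makes the receive/hide/redeliver lifecycle concrete — `VisibilityQueue` and its timestamps are invented for illustration:

```python
class VisibilityQueue:
    """Toy model of visibility-timeout semantics."""
    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self.messages = {}         # id -> body, still in the queue
        self.invisible_until = {}  # id -> deadline while leased to a consumer

    def receive(self, now):
        for mid, body in self.messages.items():
            if self.invisible_until.get(mid, 0) <= now:
                # Hide the message from other consumers for the timeout window.
                self.invisible_until[mid] = now + self.visibility_timeout
                return mid, body
        return None

    def ack(self, mid):
        # Acknowledged messages are deleted permanently.
        self.messages.pop(mid, None)
        self.invisible_until.pop(mid, None)

q = VisibilityQueue(visibility_timeout=30)
q.messages["m1"] = "charge order"

mid, _ = q.receive(now=0)          # consumer A takes the message
assert q.receive(now=10) is None   # invisible to consumer B at t=10
mid2, _ = q.receive(now=31)        # A never acked; message reappears after 30 s
q.ack(mid2)                        # processed and acknowledged -> deleted
assert q.receive(now=60) is None
```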
Set Visibility Timeout Longer Than Your Processing Time
If processing takes 25 s and your visibility timeout is 20 s, the message becomes visible mid-processing, causing duplicate processing. Set the visibility timeout to 3–5× your expected processing time, or extend it programmatically as the job progresses.
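Extending the timeout programmatically is a heartbeat: every N seconds of processing, push the visibility deadline out again (SQS exposes this as `ChangeMessageVisibility`). A sketch of the arithmetic, with invented parameter names:

```python
def heartbeat_deadlines(start, processing_time, timeout, extend_every):
    """Extend visibility periodically so a long job never goes visible mid-run.

    Returns the visibility deadline after the initial receive and after each
    extension; every deadline must stay ahead of the elapsed processing time.
    """
    deadlines = [start + timeout]
    t = start
    while t + extend_every < start + processing_time:
        t += extend_every
        deadlines.append(t + timeout)  # each extension resets the clock
    return deadlines

# A 95 s job with a 30 s timeout, extended every 20 s, stays invisible throughout.
d = heartbeat_deadlines(start=0, processing_time=95, timeout=30, extend_every=20)
assert d[-1] > 95
```

The extension interval must be comfortably shorter than the timeout itself, or a paused process misses its window before the next heartbeat fires.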
Ordering Guarantees
Standard queues (SQS Standard, most RabbitMQ configurations) do not guarantee message ordering when multiple consumers are running — a faster consumer may process a later message before a slower consumer finishes an earlier one. If ordering matters (e.g., user events must be processed in sequence), use one of these approaches:
- Single consumer — only one consumer processes the queue (eliminates parallelism)
- FIFO queues — SQS FIFO guarantees ordering within a message group ID, but caps throughput at 300 msg/s per API action (3,000 msg/s with batching)
- Partitioned topics — Kafka partitions messages by key; all messages with the same key go to the same partition, processed by one consumer group member in order
- Application-level sequencing — include a sequence number and have the consumer handle out-of-order processing with a buffer
Partition-Based Parallelism (Kafka Consumer Groups)
Apache Kafka uses a different competing consumers model: partitions, not individual messages, are assigned to consumers. Each partition is consumed by exactly one consumer in a consumer group at a time. This guarantees ordering within a partition while enabling parallelism across partitions. The number of active consumers is capped at the number of partitions — adding more consumers than partitions results in idle consumers.
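A round-robin partition assignment sketch (one of several strategies Kafka supports) shows why extra consumers go idle — the function and names here are illustrative:

```python
def assign_partitions(partitions, consumers):
    """Round-robin assignment: each partition goes to exactly one consumer."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

a = assign_partitions(list(range(4)), ["c0", "c1", "c2", "c3", "c4", "c5"])
# With 4 partitions and 6 consumers, two consumers get no partitions at all.
idle = [c for c, ps in a.items() if not ps]
print(idle)  # ['c4', 'c5']
```

This is why partition count is the real parallelism ceiling: it is chosen at topic creation and is expensive to change, so it is usually set higher than the current consumer count.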
Idempotent Consumers
Because queues guarantee at-least-once delivery (not exactly-once), a message may be processed more than once (e.g., consumer crashes after processing but before acknowledging). Consumers must be idempotent — processing the same message twice produces the same result as processing it once. Techniques: track processed message IDs in a database, use database upsert instead of insert, or make the operation inherently idempotent (setting a value is idempotent; incrementing is not).
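The message-ID tracking technique can be sketched as follows — the in-memory `processed` set stands in for a database table, and the account/balance names are invented:

```python
processed = set()            # in production: a table keyed by message ID
balance = {"acct-1": 0}

def handle(message_id, account, amount):
    """Deduplicate by message ID so redelivery doesn't double-apply."""
    if message_id in processed:
        return  # already applied; at-least-once redelivery becomes a no-op
    balance[account] += amount   # increment alone is NOT idempotent...
    processed.add(message_id)    # ...recording the ID makes the handler idempotent

handle("msg-7", "acct-1", 100)
handle("msg-7", "acct-1", 100)   # duplicate delivery
assert balance["acct-1"] == 100  # applied only once
```

In a real system the state change and the dedup record must commit in one transaction; otherwise a crash between the two lines reintroduces the duplicate.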
Interview Tip
In interviews, the competing consumers pattern comes up when you need to scale async workers. Key points to mention: visibility timeout calibration, idempotency for at-least-once delivery, ordering trade-offs (standard vs FIFO vs Kafka partitions), and auto-scaling consumers based on queue depth. Also mention dead letter queues — messages that repeatedly fail should be moved to a DLQ for investigation rather than blocking the main queue.
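The DLQ routing decision reduces to a receive-count check, which brokers like SQS apply automatically via a redrive policy — a minimal sketch with an invented `route` helper:

```python
MAX_RECEIVES = 3  # roughly SQS's maxReceiveCount in a redrive policy

def route(message, receive_count, dlq):
    """Park messages that keep failing instead of retrying them forever."""
    if receive_count > MAX_RECEIVES:
        dlq.append(message)  # moved aside for investigation
        return "dead-lettered"
    return "retry"

dlq = []
assert route("poison-msg", 4, dlq) == "dead-lettered"
assert dlq == ["poison-msg"]
```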
Auto-Scaling Consumers
Consumer count should scale with queue depth. Common approach: use a CloudWatch alarm (AWS) or a custom metric to trigger Auto Scaling Group scale-out when queue depth exceeds a threshold (e.g., depth > 1000), and scale-in when depth drops to zero. Target-tracking scaling with a custom metric (queue depth / consumer count) maintains a target messages-per-consumer ratio, providing smooth scaling without oscillation.
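The target-tracking calculation itself is simple arithmetic — a sketch with illustrative parameter names and clamping bounds:

```python
import math

def desired_consumers(queue_depth, target_per_consumer, min_c=1, max_c=50):
    """Target tracking: keep (queue depth / consumer count) near the target."""
    if queue_depth == 0:
        return min_c  # scale in to the floor when the queue is drained
    want = math.ceil(queue_depth / target_per_consumer)
    return max(min_c, min(max_c, want))  # clamp to the allowed fleet size

# 5,000 queued messages at a target of 100 per consumer -> 50 consumers.
print(desired_consumers(5000, 100))  # 50
```

The ceiling division means the fleet scales out eagerly but only scales in when depth genuinely falls, which is what damps the oscillation that naive threshold alarms produce.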