Message Queues vs Event Streams: Picking the Right Messaging Backbone

February 10, 2024·14 min read·by Bishwambhar Sen

Two messaging pipeline diagrams side by side showing queue consumption vs event stream replay

When teams are building their first async communication layer, the decision between a message queue and an event stream often gets resolved by who on the team has used which technology before, or by whatever the current job market is rewarding. Neither is a sound basis for an architectural decision that will shape how your services communicate for the next five years.

The distinction is not just operational. The two models encode different semantics about message ownership, consumption, retention, and replay — semantics that ripple outward to affect how you model your domain, how you scale consumers, and how you audit and recover from failures. This post makes those semantics explicit.

Concept

Message Queues: Work Distribution with Ownership Transfer

A message queue implements a destructive consumption model. When a consumer reads a message and acknowledges it, the message is gone. The queue's contract is: each message is processed by exactly one consumer, and once processed, it ceases to exist in the queue.

RabbitMQ is the canonical implementation of this model. It uses AMQP, a binary protocol designed around the notion of exchanges, bindings, and queues. Producers publish to an exchange, which routes to one or more queues based on routing keys and binding rules. Consumers subscribe to queues and receive messages in a round-robin distribution across competing consumers.

The queue model is an excellent fit for work distribution: tasks that need to be executed exactly once by one worker. Background job processing, email sending, payment webhooks, PDF generation — workloads where you want horizontal scaling across consumers and where the consumer pool collectively owns the responsibility for processing.

The critical property is backpressure. When consumers are slow, messages accumulate in the queue. When consumers are faster than producers, the queue stays empty. This acts as a natural buffer between producer throughput and consumer capacity. RabbitMQ also supports publisher confirms and consumer acknowledgments, giving you durable, at-least-once delivery with explicit retry control.

Event Streams: Immutable Log with Consumer-Owned Offsets

Kafka implements the log model. Messages (events) are written to a partitioned, ordered, immutable log. Consumers maintain their own read position (offset) in the log. Reading an event does not remove it. Different consumer groups can independently read the same events from different offsets — one group might be at offset 5,000 and another at offset 1,200 of the same partition.

This is a fundamentally different ownership model. The broker does not track per-message delivery state. The consumer group tracks its own offset, either committed back to Kafka or managed externally. Events are retained for a configured window (time-based or size-based), not until acknowledgment.

The stream model is an excellent fit for event sourcing, audit logs, event-driven projections, and stream processing: workloads where multiple independent consumers need to react to the same events, where you need to replay history to rebuild state, or where the event log itself is your source of truth.

Ordering Guarantees

RabbitMQ guarantees message ordering within a queue: messages are delivered to consumers in the order they were enqueued. However, when multiple consumers compete on the same queue, the processing order is non-deterministic — two messages can be processed concurrently by different consumers, and the second may finish before the first.

Kafka guarantees ordering within a partition. A topic is divided into partitions (typically by a partition key, such as an entity ID). All events for the same key land in the same partition and are consumed in strict order by a single consumer within a consumer group. Ordering across partitions is not guaranteed.

This distinction is critical for event-driven state reconstruction. If you're building a CQRS read model from an event stream, you need all events for a given aggregate to arrive in order. Kafka's partition key approach makes this straightforward: partition by aggregate ID. With RabbitMQ, you'd need per-aggregate queues or a sequencing mechanism, which scales poorly.

Consumer Groups and Fan-Out

In Kafka, a consumer group is a logical subscriber. Each partition is consumed by exactly one member of the group at a time, but multiple groups can each independently consume all events. This is built-in fan-out at zero additional cost.

In RabbitMQ, fan-out requires explicit topology: you bind the same exchange to multiple queues, one per consumer type. Each queue independently accumulates copies of the message. This is flexible but requires upfront knowledge of all consumers; a new consumer type added after the fact will not see historical messages.

This is the decisive difference for event-driven architectures with multiple downstream services: Kafka's consumer group model lets you add new consumers at any time without changing producer code or message duplication. RabbitMQ fan-out requires adding the binding (and therefore accumulating queue backpressure) before the producer starts writing.

Constraints

Retention vs. deletion: Kafka retains events for a configurable window regardless of consumption. If you set a 7-day retention window on a high-throughput topic (100k events/sec), you need to plan for the disk footprint: 100k events × 86,400 sec × 7 days × average event size. At 500 bytes per event that is approximately 30 TB. Log compaction reduces this for key-value topics but adds CPU overhead.

RabbitMQ queue depth and memory: RabbitMQ keeps unacknowledged messages in memory by default. A slow consumer that falls behind can cause the broker to page messages to disk (flow control), which degrades throughput for all queues on that node. Queue depth must be bounded through dead-letter configuration or consumer SLOs.

Kafka partition count and consumer scaling: A consumer group can scale horizontally up to the number of partitions. If a topic has 12 partitions, you can have at most 12 active consumers in a group. Adding a 13th consumer leaves it idle. Partition count is set at topic creation and is difficult to change without rebalancing the entire consumer group, which causes a pause in consumption.

Exactly-once semantics: Both systems offer at-least-once delivery by default. Kafka offers idempotent producers and transactional APIs that enable exactly-once semantics within a single cluster, at significant protocol complexity. RabbitMQ's quorum queues with publisher confirms give you durable at-least-once. Exactly-once across heterogeneous systems (queue → database) requires idempotent consumers and deduplication keys regardless of the broker.

Trade-offs

Dimension	RabbitMQ (Queue)	Kafka (Stream)
Consumption model	Destructive (one consumer)	Non-destructive (any consumer, any time)
Fan-out	Topology-based, requires pre-binding	Built-in via consumer groups
Ordering	Per-queue, non-deterministic across consumers	Per-partition, strict
Replay	Not supported natively	Up to retention window
Throughput ceiling	~50k msg/sec per node	Millions/sec with partition scaling
Operational model	Broker tracks delivery state	Consumer tracks offset
Best fit	Task dispatch, work queues	Event sourcing, audit, stream processing

The rule of thumb: if the question is "who should do this work?", use a queue. If the question is "what happened, and who needs to know?", use a stream.

Code

RabbitMQ Consumer with Dead-Letter and Retry Logic

public sealed class PaymentWebhookConsumer : BackgroundService
{
    private readonly IConnection _rabbitConnection;
    private readonly IPaymentWebhookProcessor _processor;
    private readonly ILogger<PaymentWebhookConsumer> _logger;
    private const int MaxRetryAttempts = 3;

    public PaymentWebhookConsumer(
        IConnection rabbitConnection,
        IPaymentWebhookProcessor processor,
        ILogger<PaymentWebhookConsumer> logger)
    {
        _rabbitConnection = rabbitConnection;
        _processor = processor;
        _logger = logger;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        using var channel = _rabbitConnection.CreateModel();

        // Declare the dead-letter exchange first
        channel.ExchangeDeclare("payment.dlx", ExchangeType.Direct, durable: true);
        channel.QueueDeclare("payment.webhooks.dead", durable: true, exclusive: false);
        channel.QueueBind("payment.webhooks.dead", "payment.dlx", "payment.webhooks");

        // Main queue routes failed messages to DLX after MaxRetryAttempts
        channel.QueueDeclare(
            queue: "payment.webhooks",
            durable: true,
            exclusive: false,
            autoDelete: false,
            arguments: new Dictionary<string, object>
            {
                ["x-dead-letter-exchange"] = "payment.dlx",
                ["x-delivery-limit"] = MaxRetryAttempts // quorum queue feature
            });

        // Prefetch: process one message at a time per consumer for flow control
        channel.BasicQos(prefetchSize: 0, prefetchCount: 1, global: false);

        var consumer = new AsyncEventingBasicConsumer(channel);
        consumer.Received += async (_, eventArgs) =>
        {
            var body = Encoding.UTF8.GetString(eventArgs.Body.ToArray());
            try
            {
                var webhookPayload = JsonSerializer.Deserialize<PaymentWebhookPayload>(body)!;
                await _processor.ProcessAsync(webhookPayload, stoppingToken);
                channel.BasicAck(eventArgs.DeliveryTag, multiple: false);
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Failed to process webhook. Nacking for requeue.");
                // Requeue: broker will redeliver up to x-delivery-limit times, then DLX
                channel.BasicNack(eventArgs.DeliveryTag, multiple: false, requeue: true);
            }
        };

        channel.BasicConsume("payment.webhooks", autoAck: false, consumer: consumer);
        await Task.Delay(Timeout.Infinite, stoppingToken);
    }
}

Kafka Consumer Group with Offset Management and Replay

public sealed class OrderEventProjector : BackgroundService
{
    private readonly IConsumer<string, string> _kafkaConsumer;
    private readonly IOrderReadModelRepository _readModelRepository;
    private readonly ILogger<OrderEventProjector> _logger;

    public OrderEventProjector(
        IConsumer<string, string> kafkaConsumer,
        IOrderReadModelRepository readModelRepository,
        ILogger<OrderEventProjector> logger)
    {
        _kafkaConsumer = kafkaConsumer;
        _readModelRepository = readModelRepository;
        _logger = logger;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        // Consumer group: all instances of this service share one group,
        // partitions are distributed across them
        _kafkaConsumer.Subscribe("orders.events");

        while (!stoppingToken.IsCancellationRequested)
        {
            ConsumeResult<string, string>? result = null;
            try
            {
                result = _kafkaConsumer.Consume(TimeSpan.FromMilliseconds(200));
                if (result is null) continue;

                var domainEvent = JsonSerializer.Deserialize<OrderDomainEvent>(result.Message.Value)!;

                // Apply event to read model — idempotent by event ID
                await _readModelRepository.ApplyEventAsync(domainEvent, stoppingToken);

                // Commit offset only after successful processing
                // StoreOffset (not CommitAsync) batches commits for throughput
                _kafkaConsumer.StoreOffset(result);
            }
            catch (ConsumeException ex) when (ex.Error.IsFatal)
            {
                _logger.LogCritical(ex, "Fatal Kafka consumer error. Shutting down.");
                break;
            }
            catch (Exception ex)
            {
                _logger.LogError(ex,
                    "Failed to project event at offset {Offset} partition {Partition}",
                    result?.Offset.Value, result?.Partition.Value);
                // Do NOT commit offset: the consumer will re-read this event on restart
                // Implement a poison pill handler for events that consistently fail
            }
        }

        _kafkaConsumer.Close();
    }
}

// Replay from a specific offset — useful for rebuilding a corrupted read model
public sealed class KafkaReplayService
{
    private readonly IConsumerFactory _consumerFactory;

    public KafkaReplayService(IConsumerFactory consumerFactory)
        => _consumerFactory = consumerFactory;

    public async IAsyncEnumerable<OrderDomainEvent> ReplayFromAsync(
        string topic,
        int partition,
        long fromOffset,
        [EnumeratorCancellation] CancellationToken cancellationToken)
    {
        // Isolated consumer — not part of the main consumer group, no offset commits
        using var replayConsumer = _consumerFactory.CreateIsolatedConsumer();
        var topicPartition = new TopicPartitionOffset(topic, partition, new Offset(fromOffset));
        replayConsumer.Assign(topicPartition);

        while (!cancellationToken.IsCancellationRequested)
        {
            var result = replayConsumer.Consume(TimeSpan.FromMilliseconds(500));
            if (result is null) yield break; // End of partition reached

            yield return JsonSerializer.Deserialize<OrderDomainEvent>(result.Message.Value)!;
        }
    }
}