Scaling, Competing Consumers, Channels & Connections — Full Recall
How A New Instance Registers
When your BackgroundService starts, it calls BasicConsume. That single call is what registers the consumer with the broker. No config, no setup, no coordination needed.
Instance 1 starts → BasicConsume → broker list: [instance1]
Instance 2 starts → BasicConsume → broker list: [instance1, instance2]
Instance 3 starts → BasicConsume → broker list: [instance1, instance2, instance3]
Instance dies → broker detects TCP connection dropped → removes from list automatically. Nothing to clean up manually.
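The registration step can be sketched with the raw RabbitMQ.Client (v6) API. The host name, queue name, and handler body are placeholder assumptions, not part of the original notes:

```csharp
using RabbitMQ.Client;
using RabbitMQ.Client.Events;

var factory = new ConnectionFactory { HostName = "localhost" };
using var connection = factory.CreateConnection();
using var channel = connection.CreateModel();

channel.QueueDeclare("work-queue", durable: true, exclusive: false, autoDelete: false);

var consumer = new EventingBasicConsumer(channel);
consumer.Received += (_, ea) =>
{
    // process ea.Body here, then acknowledge
    channel.BasicAck(ea.DeliveryTag, multiple: false);
};

// This single call is the registration — the broker adds this
// instance to the queue's consumer list. When the process dies,
// the TCP connection drops and the broker removes it automatically.
channel.BasicConsume(queue: "work-queue", autoAck: false, consumer: consumer);
```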
What Competing Consumers Actually Means
Same queue, multiple consumers. Each message goes to exactly one — not all. The broker round-robins across registered consumers.
message 1 → instance 1
message 2 → instance 2
message 3 → instance 3
message 4 → instance 1 ← wraps around
This is different from multiple queues bound to the same exchange — that fans out a copy to each queue. Competing consumers on one queue means load balancing — one receiver per message.
Multiple queues same binding → every queue gets a COPY (pub/sub)
Multiple consumers same queue → one consumer gets it (load balancing)
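The two shapes can be sketched side by side against an open channel. Queue and exchange names are placeholders; `consumerA` and `consumerB` are assumed to be `EventingBasicConsumer` instances:

```csharp
// Pub/sub: two queues bound to one fanout exchange —
// every published message is COPIED into both queues.
channel.ExchangeDeclare("events", ExchangeType.Fanout);
channel.QueueDeclare("audit-queue", durable: true, exclusive: false, autoDelete: false);
channel.QueueDeclare("email-queue", durable: true, exclusive: false, autoDelete: false);
channel.QueueBind("audit-queue", "events", routingKey: "");
channel.QueueBind("email-queue", "events", routingKey: "");

// Competing consumers: two BasicConsume calls on the SAME queue —
// each message is delivered to exactly one of the two.
channel.BasicConsume("work-queue", autoAck: false, consumer: consumerA);
channel.BasicConsume("work-queue", autoAck: false, consumer: consumerB);
```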
The Prefetch Problem Without Configuration
Without prefetch, the broker dispatches messages to consumers as fast as it can — no waiting for ACKs, no checking whether a consumer is busy.
10 messages, 3 consumers, no prefetch:
Broker fires immediately:
instance 1 buffer: [msg1, msg2, msg3, msg4, msg5, msg6, msg7]
instance 2 buffer: [msg8]
instance 3 buffer: [msg9, msg10]
Why does this happen? Because the broker has no limit on unacked messages per consumer. It just keeps sending. Fast consumers or consumers that connect first get everything. Others sit idle.
Prefetch Count — What It Actually Is
Maximum number of unacked messages a consumer can hold at one time. Broker stops sending when limit is reached. Only resumes after an ACK comes back.
prefetchCount: 1
Broker sends msg1 to instance1 — instance1 has 1 unacked, at limit
Broker sends msg2 to instance2 — instance2 has 1 unacked, at limit
Broker sends msg3 to instance3 — instance3 has 1 unacked, at limit
msg4-msg10 stay in queue — all consumers at limit
instance2 ACKs → instance2 drops to 0 unacked → broker sends msg4 to instance2
instance3 ACKs → sends msg5 to instance3
instance1 ACKs → sends msg6 to instance1
Fair dispatch. Whoever finishes first gets the next message.
QoS — Same Thing As Prefetch
QoS stands for Quality of Service. BasicQos is just the method name — it IS prefetch configuration. Not a queue setting, but a consumer-side instruction to the broker, set on the channel.
_channel.BasicQos(
    prefetchSize: 0,   // max bytes — always 0, unlimited
    prefetchCount: 1,  // max unacked messages — this is what matters
    global: false      // per consumer, not a shared pool
);
Set once on the channel immediately after creating it, before consuming.
global: false vs global: true
global: false → prefetch limit per consumer independently
global: true → prefetch limit shared pool across all consumers on channel
With global: false and prefetchCount: 1 — each consumer on the channel gets 1 message at a time independently.
With global: true and prefetchCount: 3 — all consumers on the channel share a pool of 3 total unacked messages. One consumer could take all 3, leaving others idle.
Always use global: false. Per-consumer fairness is what you want.
In your abstraction each consumer gets its own channel anyway — so the global flag makes no practical difference. But global: false is the correct convention.
Ordering Trade-off
Competing consumers break message ordering:
msg1 → instance1 (500ms processing)
msg2 → instance2 (10ms processing)
msg3 → instance3 (50ms processing)
Processed order: 2, 3, 1 ← not publish order
If ordering matters — single consumer, single instance, prefetchCount: 1. Sequential processing. You sacrifice throughput for order.
For most systems ordering within a message type is not required — each message is independent.
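The ordered setup described above is just the prefetch configuration applied to a single registered consumer — a sketch, assuming an open channel and an `EventingBasicConsumer` named `consumer`:

```csharp
// Ordering setup: ONE instance, ONE consumer, prefetchCount: 1.
channel.BasicQos(prefetchSize: 0, prefetchCount: 1, global: false);
channel.BasicConsume("ordered-queue", autoAck: false, consumer: consumer);

// With exactly one consumer holding at most one unacked message,
// messages are processed strictly in queue order —
// throughput is sacrificed for ordering.
```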
What A Channel Is
A virtual connection multiplexed inside one TCP connection. Cheap to create — just a number assigned by the broker, microseconds. No TCP overhead.
TCP Connection (one, expensive, persistent)
├── Channel 1 ← consumer for queue A
├── Channel 2 ← consumer for queue B
├── Channel 3 ← publisher
└── Channel 4 ← publisher
All channels share one TCP connection. From outside — one connection to broker. Inside — many independent virtual connections.
Analogy: TCP connection = motorway. Channels = lanes. Opening a lane is cheap, you don't build a new motorway.
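The motorway/lanes picture maps directly onto the RabbitMQ.Client (v6) API — one `CreateConnection` call, several `CreateModel` calls (host name is a placeholder):

```csharp
using RabbitMQ.Client;

var factory = new ConnectionFactory { HostName = "localhost" };

// One TCP connection per app — expensive, persistent (the motorway).
using var connection = factory.CreateConnection();

// Channels — cheap virtual connections multiplexed over it (the lanes).
using var channelA = connection.CreateModel(); // consumer for queue A
using var channelB = connection.CreateModel(); // consumer for queue B
using var publish  = connection.CreateModel(); // publisher

// All three share the single TCP connection underneath.
```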
Why Channel Is Not Thread Safe
AMQP sends data in frames. Two threads writing to the same channel simultaneously interleave their frames, so the broker receives garbled data — a connection error or silent data corruption.
Thread 1 writing frame for message A ──┐
Thread 2 writing frame for message B ──┤ interleaved frames
▼
broker receives invalid AMQP
One channel per thread. One channel per consumer. Never shared.
HTTP vs AMQP Connection Model
This is a fundamental difference:
HTTP:
Request 1 → open TCP → send → receive → close
Request 2 → open TCP → send → receive → close
Request 3 → open TCP → send → receive → close
AMQP:
App starts → open TCP → stays open forever
Publish msg1 → over existing connection
Publish msg2 → over existing connection
Consumer receives → over existing connection
ACK → over existing connection
...days later...
Publish msg999 → same connection
RabbitMQ needs a persistent connection because the broker pushes messages to consumers the moment they arrive. If the connection were per-message, the ~100ms TCP handshake on every message would defeat the purpose of a message broker.
Connection vs Channel In Code
Connection → expensive TCP, one per app, singleton, safe to share
Channel → cheap virtual, one per consumer, not injectable, never shared
// Connection — DI singleton
services.AddSingleton<RabbitMqConnection>();

// Channel — created locally inside each consumer
public abstract class RabbitMqConsumer<T> : BackgroundService
{
    private readonly RabbitMqConnection _connection;
    private IModel? _channel; // owned by this consumer only

    protected RabbitMqConsumer(RabbitMqConnection connection)
        => _connection = connection;

    protected override Task ExecuteAsync(CancellationToken ct)
    {
        _channel = _connection.CreateChannel(); // new channel, not from DI
        _channel.BasicQos(0, 1, false);
        _channel.BasicConsume(...);
        return Task.CompletedTask;
    }
}
Channel is never injected via DI because it is not thread safe and has a specific lifecycle tied to the consumer that owns it.
What Happens When Consumer Crashes Mid-Processing
Instance 2 receives message
Instance 2 starts processing
Instance 2 crashes — no ACK sent
Broker sees: message delivered, no ACK, consumer connection gone
Broker requeues message automatically
Message delivered to instance 1 or instance 3
Nothing lost
This is exactly why autoAck: false matters. With autoAck: true the broker has already deleted the message on delivery — a crash means the message is gone forever.
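The manual acknowledgement side of this can be sketched in the handler — `Process` is a hypothetical application method, not part of the client library:

```csharp
consumer.Received += (_, ea) =>
{
    try
    {
        Process(ea.Body.ToArray()); // hypothetical application handler
        // Success: the broker deletes the message only now.
        channel.BasicAck(ea.DeliveryTag, multiple: false);
    }
    catch
    {
        // Failure: requeue: true puts the message back on the queue
        // so another consumer (or this one) can retry it.
        channel.BasicNack(ea.DeliveryTag, multiple: false, requeue: true);
    }
};

// autoAck: false — the broker holds the message as "unacked" until
// BasicAck arrives. A crash before the ACK triggers automatic requeue.
channel.BasicConsume("work-queue", autoAck: false, consumer: consumer);
```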
What Happens When Connection Drops
Network blip or broker restart
→ TCP connection dies
→ all channels on it die
→ all consumers stop receiving
→ all unacked messages requeued by broker
→ your app needs to reconnect and re-register consumers
Raw client — you handle reconnection yourself. MassTransit and Rebus handle it automatically with exponential backoff, channel recovery, consumer re-registration.
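For what it's worth, the official .NET client does ship built-in recovery options; the manual loop below is a hypothetical sketch of what doing it yourself looks like (names and intervals are assumptions):

```csharp
using RabbitMQ.Client;
using RabbitMQ.Client.Exceptions;

// Built-in recovery in RabbitMQ.Client v6:
var factory = new ConnectionFactory
{
    HostName = "localhost",
    AutomaticRecoveryEnabled = true,                    // recover connection, channels, consumers
    NetworkRecoveryInterval = TimeSpan.FromSeconds(10), // fixed retry interval
    RequestedHeartbeat = TimeSpan.FromSeconds(30),      // detect dead connections
};

// Manual alternative — exponential backoff by hand:
IConnection? connection = null;
var delay = TimeSpan.FromSeconds(1);
while (connection is null)
{
    try { connection = factory.CreateConnection(); }
    catch (BrokerUnreachableException)
    {
        Thread.Sleep(delay);
        delay = TimeSpan.FromSeconds(Math.Min(delay.TotalSeconds * 2, 60));
    }
}
// ...and after reconnecting you still have to recreate every channel
// and call BasicConsume again on every queue yourself.
```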
Why Raw Client Is Unwise In Production
Managing a persistent TCP connection correctly means handling:
Reconnection logic → detect drop, reconnect with backoff
Channel recovery → recreate all channels after reconnect
Consumer re-registration → BasicConsume again on every queue
Publisher confirm tracking → resend unconfirmed messages after reconnect
Heartbeat configuration → AMQP heartbeat to detect dead connections
Graceful shutdown → drain in-flight, ACK pending, then close
Thread safety enforcement → ensure channels never accidentally shared
MassTransit and Rebus handle all of this — tested, production proven. Raw client is for learning what is actually happening under the hood, which is exactly what you've been doing.
Kubernetes Scaling
Queue depth is a natural backpressure signal. KEDA reads queue depth from the RabbitMQ management API and drives the Kubernetes HPA:
queue depth 0-10 → 1 pod
queue depth 11-50 → 3 pods
queue depth 51+ → max pods
No consumer code changes. Just run more instances. Broker distributes automatically via competing consumers.
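A KEDA setup like the one above is declared as a ScaledObject. This is a sketch of the shape, assuming a Deployment named worker-deployment, a queue named work-queue, and a TriggerAuthentication holding the management API credentials:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: worker-scaler
spec:
  scaleTargetRef:
    name: worker-deployment     # hypothetical Deployment to scale
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: rabbitmq
      metadata:
        queueName: work-queue
        mode: QueueLength       # scale on messages per replica
        value: "10"
        protocol: http          # read depth via the management API
      authenticationRef:
        name: rabbitmq-auth     # TriggerAuthentication with the host URL
```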