Ask an engineer to draw a Kafka consumer and you'll usually get two boxes and an arrow: a client on the left, Kafka on the right, poll() in the middle pulling records. One connection, one socket, done.
The reality is that a consumer calling poll() is running a small distributed protocol against your cluster:
- it discovers brokers
- finds coordinators
- joins a group
- fetches from several leaders at once
- heartbeats on a timer
- commits offsets
- re-authenticates every new connection it opens along the way
- ...
By the time it reaches steady state, it holds many open TCP connections to different brokers, and only Kafka decides which is which.
We focus on mainstream clients (Java and
librdkafkafamilies). The newer protocols change a bit the picture (KIP-848 for consumer groups, KIP-890 for transactions) but the overall shape is more-or-less the same.
One consumer, many brokers
Each of those connections goes to a different broker doing a different job: some are partition leaders for the topics you read, one is your group's coordinator, one might be a transaction coordinator.
This complexity raises a lot of questions:
- How does the client get here?
- What keeps these connections alive?
- Why is this fan-out worth understanding before you put a proxy, a mesh, or a load balancer anywhere near Kafka?
Let's dig in.
Bootstrap is discovery, not traffic
When you set up a Kafka client, you only configure one address: bootstrap.servers. It's usually a single hostname, or a few for redundancy. What matters is what that address is for, and it's not what you'd expect.
Bootstrap is discovery, not traffic. bootstrap.servers is just the address (or addresses) the client uses to make its first connection. Once that connection is up (it negotiates ApiVersions first, and authenticates if SASL is on), the client calls the Metadata API. The response carries the full broker list (each broker with its own broker.id, host, and port), plus every partition leader.
Now the client has the full map and can connect directly to whichever broker owns whatever it needs.
Said differently: one hostname in the config can become a dozen-plus broker connections (locally you can use netstat to see that).
A few warnings about bootstrap servers:
- Multiple
bootstrap.serversentries are failover for the initial contact: the client shuffles the list and picks one to reach (not strictly in config order) until one answers. They are not load-balancing for your traffic. - A load balancer in front of
bootstrap.serversis fine. Any broker can answerMetadata, so it doesn't matter which one the LB picks. - After that first discovery, periodic
Metadatarefreshes use the brokers the client already knows about. The bootstrap addresses matter again only if the client has to re-bootstrap.
🚫 "We'll just put all the brokers behind one load balancer and point clients at the VIP."
Don't do that, this does not work. After bootstrap, a client reaches each broker by the advertised host and port that broker returned in Metadata, because it needs that specific broker: the one that leads the partition it wants to read. A shared VIP that round-robins to "some broker" sends a Fetch for topicX-0 to a broker that doesn't lead topicX-0, and you get a storm of NOT_LEADER_OR_FOLLOWER errors in your applications.
Load-balance the bootstrap; never load-balance partition-leader traffic.
Coordinators are partition leaders too
A consumer doesn't only pull data from brokers. It also relies on them for some of Kafka's core mechanisms, like coordinating with the other consumers in its group, and transactions:
| Broker Role | "Leader" is leader of… | Found via | Used for |
|---|---|---|---|
| Partition leader | the topic-partition itself | Metadata | Produce, Fetch, ListOffsets |
| Group coordinator | the __consumer_offsets partition for your group.id | FindCoordinator(GROUP) | JoinGroup, OffsetCommit, Heartbeat |
| Txn coordinator | the __transaction_state partition for your transactional.id | FindCoordinator(TRANSACTION) | InitProducerId, EndTxn |
How does a client find out when this changes? It refreshes its metadata periodically, or it simply hits an error and re-discovers on the fly.
The consumer startup sequence
If you combine discovery and group membership together, you get this sequence:
Every time the client contacts a new broker, it starts over: ApiVersions, the connection handshake, and authentication. These connections are sticky for as long as the role lasts.
Note that KIP-848 collapses the classic two round-trips (JoinGroup/SyncGroup) into a single ConsumerGroupHeartbeat.
Connections are sticky
Once it's running, a consumer keeps several connections open, each pinned to a broker role, and refreshes its metadata regularly (every 5 minutes by default). A connection stays open as long as its role is in use; an idle one is closed after connections.max.idle.ms (9 minutes by default). You typically have:
- The group coordinator connection to handle heartbeats, offset commits, and group changes.
- One connection per assigned partition leader to handle fetches (the actual consuming).
- A transaction coordinator connection if the producer is transactional.
If you have a consumer reading a 12-partition topic whose leaders sit on 12 different brokers, it can have up to 13 TCP connections opened.
It stays one connection per broker no matter how you tune the client.
max.in.flight.requests.per.connection(5 by default) is how many unacknowledged requests the client will pipeline on that single connection before it waits, not a number of connections. A producer writing to those 12 leaders still opens 12 connections, each carrying up to 5 in-flight requests, not 60.
Three jobs, three cadences
Inside the client, three jobs run on their own cadences. In the classic consumer only the heartbeat has its own background thread; fetches and offset commits ride on your poll() calls:
| Loop | RPC | Cadence | Target |
|---|---|---|---|
| Heartbeat | Heartbeat / ConsumerGroupHeartbeat | every 3s (heartbeat.interval.ms); server-set ~5s (KIP-848) | Group coordinator |
| Fetch | Fetch | continuous, per partition | Partition leaders |
| OffsetCommit | OffsetCommit | auto-commit interval, or explicit | Group coordinator |
Three jobs, three cadences, two different brokers.
Each one can fail:
- Miss heartbeats and the coordinator evicts you from the group, triggering an instant rebalance.
- If a commit fails, you will reprocess from the last good offset on restart.
- If you fetch too slowly, you fall behind: lag climbs, even while everything looks healthy.
Transactions add another coordinator
When you turn on transactions, Kafka adds a fourth role: the transaction coordinator. Its job is to track transactions as they're started, committed, or rolled back across the cluster:
That's a lot of coordination just to guarantee the I in ACID.
The diagram shows transaction protocol v1. The producer registers each partition with the transaction coordinator (AddPartitionsToTxn) before it writes there, so the coordinator knows where to place markers at commit time. Kafka 4.x's v2 (KIP-890) drops both client-side calls: the broker verifies and registers partitions itself on Produce, there's no separate AddOffsetsToTxn, and TxnOffsetCommit goes straight to the group coordinator. Fewer round-trips, same guarantees.
WriteTxnMarkersis intra-cluster. OnEndTxn, the transaction coordinator writes the commit to its own log and can acknowledge the producer right away, then asynchronously sends the markers to every involved partition leader, including the__consumer_offsetsleader, which tells the group coordinator to surface the committed offsets.
Error codes and recovery
Because every "leader" can move, clients spend a lot of their lives reacting to errors that mean go find the new owner. Each error code is really a state-machine transition. Here's what each one means:
| Error | What it means | Recovery |
|---|---|---|
NOT_LEADER_OR_FOLLOWER | The partition leader moved to another broker. | Refresh Metadata, reconnect to the new leader. |
LEADER_NOT_AVAILABLE | No leader yet; an election is in progress. | Back off and retry. |
NOT_COORDINATOR | The group or transaction coordinator moved. | Re-run FindCoordinator. |
COORDINATOR_LOAD_IN_PROGRESS | The coordinator is still loading its state. | Back off and retry. Do not re-discover. |
ILLEGAL_GENERATION | A rebalance happened (classic protocol). | Re-join the group. |
STALE_MEMBER_EPOCH | The request's member epoch doesn't match the coordinator's (KIP-848). | Retry once the next ConsumerGroupHeartbeat returns the updated epoch. |
UNKNOWN_MEMBER_ID | The coordinator forgot this member. | Re-join the group. |
FENCED_INSTANCE_ID | Two instances share one group.instance.id. | Application error; cannot auto-recover. |
OFFSET_OUT_OF_RANGE | The fetch position is before or past the log. | Apply auto.offset.reset, or an app handler. |
TOPIC_AUTHORIZATION_FAILED | An ACL changed. | Surface to the application; no auto-recovery. |
Inside one poll() call
When your code calls poll(100ms) and gets a small batch of records back, here's what the client may actually have done in that window:
- Update coordinator state: run any pending rebalance and refresh the assignment
- Send a queued
OffsetCommitif auto-commit is due - If records are already buffered from a previous fetch, return them right away
- Otherwise send
Fetchto each assigned partition leader and poll the network, up to the timeout - Refresh
Metadataif it's stale or a recent error asked for it - Decode the batches into records and hand them to the caller
The heartbeat isn't in this list: in the classic consumer it runs on its own background thread, not inside poll().
The fan-out is a control plane
Kafka is not just about Produce and Consume.
It's about discovery, coordination, group membership, transactions, and error-driven re-routing. That's where the real complexity lives. It stays invisible to users, but Kafka, and anything standing in for it, has to handle all of it.
Whatever sits between a client and Kafka (a proxy, a service mesh, a load balancer) has to understand which connection is which, because each one terminates at a specific broker for a specific reason. Send a Fetch to the wrong broker and you get NOT_LEADER_OR_FOLLOWER failures, or rebalances that never settle.
This is the world a Kafka proxy lives in. Conduktor Gateway sits in exactly this position: it speaks the full client protocol, tracks which connection is which, and routes each accordingly.
