The Surprising Cost of Kafka Partition Waste

Chuck Larrieu Casias, June 23, 2026
The 60-second version
  • Most Kafka clusters run 40 to 70 percent partition waste, even by a deliberately conservative definition.
  • On managed Kafka the cost is obvious: per-partition-hour billing and hard per-CKU partition limits turn waste into a six-figure annual charge that can outweigh what you pay for throughput and storage.
  • On self-managed Kafka the cost is hidden but real: partition replicas, not throughput, often set how many brokers you run.
  • KRaft did not make partitions free. It fixed cluster-level metadata limits, not per-broker file descriptors, memory buffers, random I/O, or replication request overhead.
  • Conduktor addresses this in four steps: Self-Service, Chargeback, virtual cluster consolidation, and topic concentration.

Partitions are the unit of parallelism in Kafka, and the design of the system naturally encourages its users to overpartition.

Little by little, the number of partitions tends to dominate the cost profile of Kafka. When we analyze customer clusters, we typically find that 40 to 70 percent of their infrastructure cost is due to partition waste, even with a rather conservative definition of waste.

How we define waste (conservatively)

We count partitions as waste only when they fail a very low bar. For any topic whose producer throughput is under 1 MB/s, the partition count should be no more than the size of the largest consumer group reading it, that is, max(members in consumer group) across all consumer groups. You need one partition per active consumer to keep everyone busy; beyond that, on a low-throughput topic, you are paying overhead for nothing.
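As a minimal sketch of that rule, assuming you already have each topic's partition count, producer throughput, and the member count of every consumer group reading it: the `Topic` class, the topic names, and the sample numbers below are hypothetical, and treating an unread topic as needing one partition is an assumption of this sketch rather than part of the definition.

```python
from dataclasses import dataclass, field

LOW_THROUGHPUT_MB_PER_S = 1.0  # threshold from the definition above


@dataclass
class Topic:
    name: str
    partitions: int
    produce_mb_per_s: float
    # Member count of each consumer group reading the topic (hypothetical input).
    group_sizes: list[int] = field(default_factory=list)


def wasted_partitions(topic: Topic) -> int:
    """Partitions beyond the largest consumer group, for low-throughput topics only."""
    if topic.produce_mb_per_s >= LOW_THROUGHPUT_MB_PER_S:
        return 0  # this rule does not judge higher-throughput topics
    # Assumption: a topic with no consumers still keeps one partition.
    needed = max(max(topic.group_sizes, default=1), 1)
    return max(topic.partitions - needed, 0)


topics = [
    Topic("orders-dlq", 30, 0.01, [1]),
    Topic("audit-events", 50, 0.8, [3, 2]),
    Topic("clickstream", 120, 45.0, [40]),
]
report = {t.name: wasted_partitions(t) for t in topics}
print(report)
```

The rule deliberately says nothing about topics at or above 1 MB/s, so the sketch reports zero waste for them rather than guessing.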

In most cases, topics with less than 1 MB/s throughput should have only 1 partition. Heck, topics with less than 10 MB/s could get away with 1 partition most of the time. And these are the vast majority of Kafka topics.

None of this means partitions are the enemy. If a topic is genuinely throughput-bound, or you need real headroom for consumers you'll actually add, partition up, deliberately and based on measurement. The waste is the partitions nobody chose: copy-pasted defaults, "just in case" over-provisioning, a template that starts everything at 30.

As an aside: if teams are clamoring for dozens of partitions because their consumer applications are slow and need parallelism, it's likely because they are I/O-bound (e.g. making several calls to external systems for each record processed). These applications probably need fewer consumers than they think.

A low-throughput topic shown as partition ticks: 50 partitions versus a right-sized 3, both feeding the same 3 consumers at the same 1 MB/s, so the 47 extra partitions are overhead, not parallelism

The easy case: managed Kafka

If you run on a managed provider, the bill makes the argument for you. Providers commonly charge per partition-hour and enforce hard partition ceilings. Confluent Cloud, for example, allows 4,500 partitions per CKU. So partition waste may have you paying for a cluster capable of 240 MB/s produce and 720 MB/s consume when your actual total throughput is close to 10 MB/s.

Run the numbers on a large cluster and even our conservative waste figure lands in the hundreds of thousands of dollars per year.
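The sizing effect in this section can be sketched as a small calculation. The 4,500 partitions-per-CKU limit is the figure quoted above; the per-CKU throughput constants are an assumption, obtained by reading the 240 MB/s produce and 720 MB/s consume figure as a 4-CKU cluster, and the 18,000-partition workload is hypothetical. Check your provider's current limits before relying on any of these numbers.

```python
import math

PARTITIONS_PER_CKU = 4_500         # limit quoted in the text
PRODUCE_MB_PER_S_PER_CKU = 60.0    # assumption: 240 MB/s spread over 4 CKUs
CONSUME_MB_PER_S_PER_CKU = 180.0   # assumption: 720 MB/s spread over 4 CKUs


def ckus_required(partitions: int, produce_mb_s: float, consume_mb_s: float) -> dict[str, int]:
    """CKUs forced by partition count versus CKUs the traffic actually needs."""
    by_partitions = math.ceil(partitions / PARTITIONS_PER_CKU)
    by_throughput = max(
        math.ceil(produce_mb_s / PRODUCE_MB_PER_S_PER_CKU),
        math.ceil(consume_mb_s / CONSUME_MB_PER_S_PER_CKU),
        1,  # a cluster is at least one CKU
    )
    return {
        "by_partitions": by_partitions,
        "by_throughput": by_throughput,
        "provisioned": max(by_partitions, by_throughput),
    }


# Hypothetical workload: 18,000 partitions carrying roughly 10 MB/s in and out.
sizing = ckus_required(partitions=18_000, produce_mb_s=10.0, consume_mb_s=10.0)
print(sizing)
```

When `by_partitions` exceeds `by_throughput`, the gap between the two is capacity bought for partition count alone.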

"But we self-manage, so why should we care?"

This is where the conversation gets interesting. Two common objections arise when customers self-manage Kafka:

  • "We are not charged per partition. Why should we care?"
  • "We upgraded to KRaft. For us, the sky is the limit on partitions."

But just because extra partitions don't appear as a line item on an invoice doesn't mean they are free.

Replicas, not throughput, set your broker count

The widely recommended ceiling is 4,000 to 6,000 partition replicas per broker, and that counts total replicas, not just partition leaders. With a replication factor of 3, a cluster with 100,000 partitions is really 300,000 replicas the cluster has to host, place, and track. Divide 300,000 by a conservative 4,000 replicas per broker and you need 75 brokers on partition count alone, regardless of how little data flows through them.

Better hardware raises the ceiling but does not remove it. AWS publishes a figure of around 12,000 replicas per broker for a well-tuned m7g.8xlarge (32 vCPU, 128 GB RAM), which would bring the broker count in this example down to 25. The point stands: on most clusters we look at, replica count, not throughput or storage, is the scaling bottleneck. A cluster pushing a couple hundred MB/s should run comfortably on just a handful of brokers if it were not carrying tens of thousands of relatively idle partitions.

Each broker carries hardware and operational costs, plus license costs if you are using a vendor. Reducing from 25 brokers to, say, 4 brokers would be a big win.
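The broker arithmetic in this section is easy to reproduce. The sketch below uses the figures from the text (100,000 partitions, replication factor 3, ceilings of 4,000 and 12,000 replicas per broker). The throughput side is only illustrative: the 200 MB/s load and the 50 MB/s-per-broker capacity are assumed values chosen for the example, not measurements.

```python
import math


def brokers_for_replicas(partitions: int, replication_factor: int, replicas_per_broker: int) -> int:
    """Brokers needed just to host the replicas, ignoring traffic entirely."""
    replicas = partitions * replication_factor
    return math.ceil(replicas / replicas_per_broker)


def brokers_for_throughput(total_mb_s: float, mb_s_per_broker: float, replication_factor: int) -> int:
    """Brokers needed for the traffic; never fewer than the replication factor."""
    return max(math.ceil(total_mb_s / mb_s_per_broker), replication_factor)


conservative = brokers_for_replicas(100_000, 3, 4_000)
well_tuned = brokers_for_replicas(100_000, 3, 12_000)
# Assumed values: 200 MB/s of traffic, 50 MB/s sustained per broker.
for_traffic = brokers_for_throughput(200.0, 50.0, 3)

print(conservative, well_tuned, for_traffic)  # 75 25 4
```

Whichever of the two functions returns the larger number is what actually sizes the cluster.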

100,000 partitions times replication factor 3 equals 300,000 replicas, divided by 4,000 per broker equals 75 brokers on partition count alone, versus about 4 brokers for the real throughput

KRaft helped the cluster, not the broker

KRaft is a genuine improvement. It raised the cluster-wide partition metadata ceiling, dramatically reduced the impact of large metadata on broker startup times, and dramatically reduced the response times for heavy metadata operations like partition reassignment and topic creation. But it did nothing for the per-broker cost of each partition:

  • More open file descriptors: Every partition replica is a set of log segment files on disk, and the broker keeps them open.
  • More memory: Kafka holds an in-memory buffer per partition.
  • More random I/O: More partitions means more scattered writes and reads.
  • Higher latency: By default, a broker uses a single thread to fetch data from other brokers to keep its follower replicas up-to-date. As Jun Rao, one of Kafka's original authors, notes: "replicating 1000 partitions from one broker to another can add about 20 ms latency, which implies that the end-to-end latency is at least 20 ms."
  • Request handling: Even empty partitions generate fetch requests for internal replication.
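Two of the per-broker costs listed above lend themselves to rough back-of-envelope estimates. The sketch below assumes three files per log segment (the `.log`, `.index`, and `.timeindex` files) and treats the quoted 20 ms per 1,000 partitions as scaling linearly, which is an extrapolation made for illustration, not a claim from the quote. Real brokers open additional files and sockets, so read the results as lower bounds.

```python
FILES_PER_SEGMENT = 3  # assumption: .log, .index, and .timeindex per segment


def open_segment_files(replicas_on_broker: int, segments_per_replica: int = 1) -> int:
    """Rough lower bound on file handles held for log segments on one broker."""
    return replicas_on_broker * segments_per_replica * FILES_PER_SEGMENT


def added_replication_latency_ms(partitions_replicated: int, ms_per_1000: float = 20.0) -> float:
    """Linear extrapolation of the quoted figure; the linearity is an assumption."""
    return partitions_replicated / 1000 * ms_per_1000


print(open_segment_files(4_000))            # 12000
print(open_segment_files(12_000, 5))        # 180000
print(added_replication_latency_ms(5_000))  # 100.0
```

Even an idle replica keeps at least one active segment, so the first figure does not shrink when a topic carries no data.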

Extra partitions look free right up until you have to move a broker. Failure, restart, routine upgrade: every partition the broker led needs its leader re-elected and its replicas brought back into sync. The election is fixed work per partition, the same whether it carries 100 MB/s or nothing at all. An idle partition skips the data catch-up but still pays that election, so ten thousand idle partitions is ten thousand elections.
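To make the election count concrete, here is a minimal sketch that tallies how many leader elections each broker would trigger if it went offline. The `leaders` map and its topic names are hypothetical; in practice you would build the map from your cluster's partition metadata.

```python
from collections import Counter

# Hypothetical leadership map: (topic, partition) -> id of the leader broker.
leaders = {
    ("orders", 0): 1, ("orders", 1): 2, ("orders", 2): 3,
    ("audit", 0): 1, ("audit", 1): 1, ("audit", 2): 2,
}


def elections_per_broker(leader_map: dict[tuple[str, int], int]) -> Counter:
    """Leader elections triggered if each broker, in turn, goes offline."""
    return Counter(leader_map.values())


per_broker = elections_per_broker(leaders)
print(per_broker[1])             # 3 elections if broker 1 stops
print(sum(per_broker.values()))  # 6: one election per partition
```

Throughput never appears in the calculation, which is the point: the count depends only on how many partitions a broker leads.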

It is also worth saying that, at the time of writing, many if not most of my self-managed customers are still running ZooKeeper-based clusters, which is even more reason to care about partition waste.

Partition efficiency is not an ancillary concern

On managed Kafka, waste is a direct charge on the bill. On self-managed Kafka, waste sets your broker count. Either way, partition waste often dominates the cost of running Kafka.

Partition waste leads to two cost models: on managed Kafka, per-partition-hour billing puts it on the invoice; on self-managed Kafka, replicas set your broker count

The good news is that this cost is mostly recoverable.

Four steps to address it with Conduktor

01 Self-Service

Establish ownership and enforce partition policies at creation time, so topics start right-sized instead of inheriting someone else's defaults. Guardrails stop the next wave of waste from being created. (Governed self-service)

02 Chargeback

Attribute cost back to the application owners who created the partitions. Once teams can see their own waste, most of them clean it up without being asked. (Kafka Chargeback)

03 Virtual cluster consolidation

Collapse sprawling non-prod physical clusters onto shared infrastructure with virtual clusters, removing whole clusters' worth of per-broker partition overhead.

04 Topic concentration

Fold many low-throughput topics (dead-letter queues, non-prod scratch topics) onto fewer physical partitions, eliminating the long tail of idle replicas that drives broker count.

Steps 1 and 2 are available through Conduktor Console. Steps 3 and 4 are advanced capabilities of the Conduktor Gateway, our Kafka-protocol-aware proxy.

You can't shrink a topic's partition count in place. An over-partitioned topic has to be migrated onto a right-sized one, and topic concentration doesn't change that. What it changes is the destination: set the concentration rule up ahead of time, and matching low-throughput or non-prod topics fold onto shared physical partitions as they migrate.

Measure for yourself

You do not have to take the 40-to-70-percent figure on faith. Install Conduktor Console Community Edition and open Insights -> Cost control to see your empty, tiny, and stale topics. Most teams are surprised by the answer. Once you can see it, you can cut it.