If Kafka Credentials Leak, Does the Attacker Get Plaintext or Ciphertext?

Stéphane Derosiaux, June 12, 2026
TLDR
  • TLS stops interception on the wire. The broker terminates it and works on plain bytes from there.
  • At-rest encryption stops stolen disks, copied snapshots, decommissioned volumes.
  • BYOK (Bring Your Own Key) wraps the storage keys with a key in your own KMS: revoke it and the whole cluster becomes unavailable.
  • A leaked credential authenticates to the Kafka API and reads every topic its ACLs allow, in plaintext.
  • Ciphertext requires encrypting the payload before it reaches Kafka, with keys the cluster never holds.
  • Payload encryption can help you meet FedRAMP High or HIPAA requirements on managed Kafka without replatforming.

Ask this question about your own Kafka platform: if a producer or consumer credential leaked today, what would the person holding it actually read?

We ask it in discovery calls, and the answer is almost always "we have TLS, and the volumes are encrypted". Sometimes BYOK on top, so the at-rest keys are wrapped by a key in your own KMS. Great.

Now think about this scenario: the attacker isn't on your network path. They don't have access to your disk. The only thing they have: a bootstrap address and a credential stolen from some GitHub repo or from the wiki. That's it. The broker will happily serve readable data and there you have a data leak.

🚫 "We have TLS in transit and encryption at rest. We're covered."

These are good and necessary practices. But the one that's missing makes them almost useless here: encrypting the data itself, end-to-end, no matter what's in the middle.

How Kafka credentials leak

Credentials leak in many ways:

  • a SASL password in a client.properties file that gets committed and reverted (too late)
  • a connection string or Kafka properties present in CI or application logs (so common)
  • someone sending credentials in Slack, where they stay forever
  • a service account shared across teams because requesting a new one takes too long
  • a service account being reused in some Kafka tooling, outside the scope of the platform team
  • passwords and keytabs living on a contractor's laptop after the contract ends
  • ...

Then you realize the scale: GitGuardian counted 28.65 million new hardcoded secrets in public GitHub commits in 2025, and nearly 70% of the credentials it confirmed valid in 2022 were still valid in January 2025.

Long-lived credentials should be replaced by short-lived ones: OAuth tokens that expire in minutes, IAM-based auth with no static password at all, mTLS certificates on automatic rotation. Short-lived credentials reduce how long a leaked secret works. They don't change what it can read while it works, and consuming a topic takes seconds. Meanwhile, static SCRAM passwords and long-lived API keys are still the default in most setups we see.

The encryption stack most people rely on rests on old assumptions, or simply on a lack of awareness of what each layer actually covers.

What each encryption layer defends against

  • TLS defends against someone on the network path: a man-in-the-middle, a network sniffer. The session is encrypted from the clients to the brokers, which terminate TLS and work on plain bytes from there (to process and store the data). It protects the wire, not the machine: anyone on the broker host is already past it.
  • At-rest encryption defends against someone who accesses the storage: a decommissioned volume, a snapshot copied out of an account, a physical disk. Amazon MSK describes its KMS integration as "transparent server-side encryption". The service decrypts on read, so data is plaintext by the time it is served to an authenticated client; the same is true on every provider and on self-hosted volumes. It protects the disk that leaves the building, not the running machine. Log on to a live broker and the data is yours: kafka-dump-log reads the segments straight off disk, or a console consumer with superuser rights drains the topics. Plaintext, by construction.
  • BYOK adds two things to at-rest encryption: custody (the storage keys are wrapped by a key you control in your own KMS) and revocation (pull the key and the whole cluster becomes useless because the brokers can't read the segments anymore). It stops the entire cluster; it cannot stop one topic or one consumer.

Figure: Three attackers approach Kafka. A network attacker is stopped by TLS, a disk thief is stopped by at-rest encryption and BYOK, and a leaked credential passes through the Kafka API unchallenged and reads plaintext.

The Kafka API read path has no encryption layer on it. Why would it? Its job is the opposite: serve readable data to authorized clients.

What can an attacker do with leaked credentials?

First, the attacker needs to reach your bootstrap servers:

  • Many managed clusters expose public endpoints
  • When the cluster sits inside a VPC, only workloads within that VPC (a Kubernetes environment, for instance) can reach it, not people or tooling outside of it

If the attacker gets such access, they can consume from the earliest offset of all accessible topics. Each topic has its own retention:

  • The attacker gets a week or a month of your traffic... or with tiered storage enabled, years of it.
  • On a compacted topic, the current state of every key. Basically, the whole customer table or order table in one go.

If the attacker is smart, they won't use a consumer group that shows up in dashboards. They will assign partitions directly: no group to join, no offset to commit. Invisible to anything that monitors consumer groups.

Only the thin layer of ACLs scopes the damage, and that scope is usually far wider than people think: one service account often reads dozens of topics.
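To get a feel for how wide that scope is, you can expand a principal's ACL bindings against the topic list. The Python below is a minimal sketch under stated assumptions: `AclBinding`, `readable_topics`, the principals, and the topic names are all hypothetical, and it is not a Kafka client API. It only models ALLOW READ bindings on topics with the resource pattern forms Kafka ACLs support (literal names, prefixes, and the `*` wildcard), and ignores DENY rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AclBinding:
    """Simplified, hypothetical view of an ALLOW READ binding on a topic."""
    principal: str
    pattern_type: str  # "LITERAL" or "PREFIXED"
    name: str          # topic name, prefix, or "*" for the wildcard


def matches(binding: AclBinding, topic: str) -> bool:
    """Return True if the binding covers the topic."""
    if binding.pattern_type == "PREFIXED":
        return topic.startswith(binding.name)
    # LITERAL: exact name, or the "*" wildcard that covers every topic.
    return binding.name == "*" or binding.name == topic


def readable_topics(principal, bindings, topics):
    """Topics a leaked credential for `principal` could consume from."""
    own = [b for b in bindings if b.principal == principal]
    return sorted(t for t in topics if any(matches(b, t) for b in own))


# Made-up cluster, for illustration only.
TOPICS = [
    "orders.created",
    "orders.refunded",
    "payments.cards",
    "customers.profile",
    "audit.logins",
]
BINDINGS = [
    AclBinding("User:orders-svc", "PREFIXED", "orders."),
    AclBinding("User:orders-svc", "LITERAL", "customers.profile"),
    AclBinding("User:analytics", "LITERAL", "*"),
]

if __name__ == "__main__":
    for principal in ("User:orders-svc", "User:analytics"):
        print(principal, "->", readable_topics(principal, BINDINGS, TOPICS))
```

In this made-up cluster, one prefixed binding and one literal binding already give the orders service three topics, and the wildcard gives the analytics account everything. Running the same expansion against your real bindings is a quick way to size the blast radius per credential.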

The invisible reader: break-glass access

A leaked credential is a reader you didn't authorize. Managed services come with one you can't even see: the provider's own operators.

Every managed service keeps a break-glass path, privileged access for emergencies, recovery, support. That's expected: it's how someone brings your cluster back when something goes wrong. But it means a superuser exists inside the provider's boundary, and at-rest encryption doesn't stop them: the infrastructure decrypts on read, so the serving path is plaintext to a privileged operator too. BYOK doesn't change it. The storage key is yours, but the decryption still happens in the provider's infra.

Providers wrap that access in controls and audit, but a loophole is always possible. The only way to close it for good is to encrypt the payload before it reaches Kafka. The key lives in your KMS, the provider never holds it, so even break-glass access gets ciphertext. You stop having to trust the most privileged account in someone else's cloud.

Summary: The five levels of Kafka encryption

When you hear "We already encrypt Kafka", ask: what does it truly mean?

| Level | Setup | Designed against | A leaked credential reads |
|-------|-------|------------------|---------------------------|
| 1 | TLS only | Interception on the wire | Plaintext |
| 2 | TLS + at-rest encryption | + stolen disks, storage access | Plaintext |
| 3 | TLS + at-rest + BYOK | + key custody, provider offboarding | Plaintext |
| 4 | Payload encrypted before Kafka | + leaked and overprivileged credentials | Ciphertext, unless it can decrypt |
| 5 | Level 4 + record signing + your own audit log | + tampering, and auditors asking for proof | Ciphertext, tamper-evident |

Most of the platforms we help sit at level 2 or 3. Rarely 4, almost never 5.

Levels 1 to 3 protect the infrastructure around the data. Level 4 protects the data itself, closest to it: the protection travels WITH the record.

At level 4, decrypting data requires being on an approved list of identities, on top of the usual ACLs stored in the infra itself. This second list lives outside the low-level Kafka infra, often at an independent proxy layer, and it goes down to the field and the key: two consumers with identical ACLs can still see different things, one reading a card number in clear, the other getting it masked.
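The second list can be sketched in a few lines of Python. This is a toy model, not Conduktor Gateway's policy format: `DECRYPT_POLICY`, `TOPIC_READERS`, `view_for`, and the principal names are hypothetical, and the protected field is held as a plain string so the example needs no crypto dependency. It only shows the decision: both consumers pass the same ACL check, and the per-field policy decides who sees the clear value.

```python
# Hypothetical policy: which principals may see which (topic, field) in clear.
DECRYPT_POLICY = {
    ("payments.cards", "card_number"): {"User:billing-svc"},
}

# Both principals hold the same Kafka ACL (READ on the topic).
TOPIC_READERS = {"payments.cards": {"User:billing-svc", "User:analytics"}}


def mask(value: str, visible: int = 4) -> str:
    """Keep the last `visible` characters, replace the rest with '*'."""
    keep = min(visible, len(value))
    return "*" * (len(value) - keep) + value[len(value) - keep:]


def view_for(principal: str, topic: str, record: dict) -> dict:
    """Return the record as this principal would see it on fetch.

    Assumes protected fields hold string values.
    """
    if principal not in TOPIC_READERS.get(topic, set()):
        raise PermissionError(f"{principal} has no READ ACL on {topic}")
    out = {}
    for field, value in record.items():
        allowed = DECRYPT_POLICY.get((topic, field))
        if allowed is None or principal in allowed:
            out[field] = value        # unprotected field, or approved principal
        else:
            out[field] = mask(value)  # protected field, principal not approved
    return out


if __name__ == "__main__":
    record = {"card_number": "4111111111111111", "amount": "42.00"}
    print(view_for("User:billing-svc", "payments.cards", record))
    print(view_for("User:analytics", "payments.cards", record))
```

The point of the design is that the two checks are independent: the ACL decides whether a fetch is served at all, and the field policy decides what the served record contains.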

And if the encryption key itself leaks? The blast radius is bounded by design. A key maps to a defined set of topics or fields, so a leaked key opens that scope and nothing else, and rotating it caps what the leaked key can ever unlock: records encrypted under the new key stay out of reach. A leaked storage or BYOK key has one scope: the whole cluster.

Three places to encrypt before Kafka

Level 4+ means the record is encrypted before any broker sees it, with keys held in a KMS the cluster can't reach. There are three options:

  • In every application. Producers encrypt, consumers decrypt, each team accesses the KMS in its own language and its own way. It works at small scale only, as it's difficult to change everything (think Connect workers, Flink jobs, partner applications, etc.). We wrote about it: Stop Building Kafka Encryption Libraries.
  • In the Kafka (de)serializer. Every producing and consuming application can configure a custom (de)serializer that calls the KMS to encrypt or decrypt the bytes. It still needs to be written and maintained in each language, along with the schema registry setup that goes with it (Avro, Protobuf). Same issues as before.
  • At the proxy. Conduktor Gateway is a Kafka-protocol proxy: producers and consumers connect to it unchanged. No SDK, no broker change, fully transparent. That includes the clients you don't control (Connect workers, Flink jobs, partner applications). It encrypts on produce (full payload or per field) and decrypts on fetch for the service accounts you authorize, with keys in your own Vault, KMS, or Fortanix.

Cherry on top: the proxy can also sign records on produce and verify on fetch, and write an audit log to a topic you own. That covers level 5.
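The sign-on-produce, verify-on-fetch idea can be sketched with an HMAC. The Python below is a minimal illustration, not how Conduktor Gateway implements it: `sign_record`, `verify_record`, the `x-signature` header name, and the hard-coded demo key are assumptions (in practice the key would come from your KMS). HMAC is a symmetric stand-in chosen to keep the example in the standard library; an asymmetric signature would separate who can sign from who can verify.

```python
import hashlib
import hmac

SIGNATURE_HEADER = "x-signature"  # hypothetical header name


def _digest(signing_key: bytes, topic: str, key: bytes, value: bytes) -> bytes:
    """HMAC-SHA256 over length-prefixed topic, key, and value."""
    mac = hmac.new(signing_key, digestmod=hashlib.sha256)
    for part in (topic.encode("utf-8"), key, value):
        # The length prefix keeps ("ab", "c") and ("a", "bc") from colliding.
        mac.update(len(part).to_bytes(4, "big"))
        mac.update(part)
    return mac.digest()


def sign_record(signing_key: bytes, topic: str, key: bytes, value: bytes) -> dict:
    """Return the headers to attach to the record on produce."""
    return {SIGNATURE_HEADER: _digest(signing_key, topic, key, value)}


def verify_record(signing_key: bytes, topic: str, key: bytes, value: bytes,
                  headers: dict) -> bool:
    """Check the signature on fetch; False if it is missing or does not match."""
    expected = _digest(signing_key, topic, key, value)
    # compare_digest avoids leaking where the two values differ through timing.
    return hmac.compare_digest(headers.get(SIGNATURE_HEADER, b""), expected)


if __name__ == "__main__":
    demo_key = b"demo-key-not-for-production"
    headers = sign_record(demo_key, "orders", b"k1", b'{"amount": 10}')
    print(verify_record(demo_key, "orders", b"k1", b'{"amount": 10}', headers))
    print(verify_record(demo_key, "orders", b"k1", b'{"amount": 99}', headers))
```

Because the topic is part of the signed input, a record replayed into a different topic fails verification just like a record whose value was edited.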

The proxy becomes the trust boundary. It handles plaintext in memory and holds KMS credentials, so it needs the same operational care as your KMS. It is stateless, so high availability is a matter of running replicas.

Regulations & encryption

For health data, plaintext vs ciphertext makes a massive difference: ePHI encrypted in line with HHS guidance, with keys that weren't compromised, can fall within the breach safe harbor and may not trigger notification at all, depending on the facts. Instead of trying to answer "what did they read?", the question becomes simpler: "who has access to the key?"

This is similar to our FedRAMP story, where the customer kept their managed Kafka provider for what it is good at (availability) and moved confidentiality, integrity, and key custody inside their own boundary at the proxy level.

You don't need to rule out managed Kafka and run everything on-prem just to keep readable data inside your boundary: apply payload encryption inside your boundary before sending the data to your managed Kafka provider. They run the brokers and own the SLA; what they store is ciphertext under your keys.


Related: FedRAMP High for Kafka Without Replatforming → · Stop Building Kafka Encryption Libraries → · Kafka Encryption with Gateway →