Kafka as a Database: When to Use Compacted Topics for State
Use Kafka compacted topics as a lightweight state store. This guide covers log compaction configuration, query limitations, and when to choose a real database instead.

Compacted topics turn Kafka from a message transport into a state layer. You get key-value semantics, durable storage, and built-in replication.
But Kafka is not a database. I've watched teams learn this the hard way—building query patterns that work in development and collapse in production.
> "We built our entire user profile system on compacted topics. Worked great until we hit 10 million users and every 'lookup' required scanning from offset zero."
>
> Tech Lead at a consumer app
How Compaction Works
Standard Kafka topics are append-only with time-based retention. Compacted topics retain the latest value for each key indefinitely.
```bash
kafka-topics --bootstrap-server localhost:9092 \
  --create --topic user-profiles \
  --config cleanup.policy=compact \
  --config min.cleanable.dirty.ratio=0.1
```

You can also create and configure topics visually instead of managing CLI commands.

Produce multiple updates for the same key. Before compaction, all messages exist in the log. After compaction, only the latest value per key survives.
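As a hedged illustration (assuming a local broker and the string-keyed `user-profiles` topic above; the class name, keys, and JSON payloads are invented), here is a minimal producer sketch writing several updates for one key:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProfileUpdates {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Three updates for one key; after compaction only the last survives
            producer.send(new ProducerRecord<>("user-profiles", "user-123", "{\"plan\":\"free\"}"));
            producer.send(new ProducerRecord<>("user-profiles", "user-123", "{\"plan\":\"pro\"}"));
            producer.send(new ProducerRecord<>("user-profiles", "user-123", "{\"plan\":\"team\"}"));

            // A null value is a tombstone: the key is removed once compaction runs
            producer.send(new ProducerRecord<>("user-profiles", "user-456", null));
        }
    }
}
```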
What You Get (and Don't Get)
Compacted topics give you:
- Latest value per key
- Durability across broker failures
- Ordering within partition
- Tombstone support (delete by sending null)
- Replay capability from offset 0
Compacted topics don't give you:
- Point queries (`SELECT * FROM topic WHERE key = X`)
- Indexes
- Transactions across keys
- Read-after-write guarantees
The fundamental limitation: every "query" is a full topic scan.
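To make that concrete, here is a minimal sketch of what a "point query" against a compacted topic actually costs, assuming a single-partition `user-profiles` topic on a local broker (the class name and key are made up for illustration):

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class FullScanLookup {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        Map<String, String> latest = new HashMap<>();
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("user-profiles", 0);
            consumer.assign(List.of(tp));
            consumer.seekToBeginning(List.of(tp));
            long end = consumer.endOffsets(List.of(tp)).get(tp);

            // Read the entire partition just to answer one key's current value
            while (consumer.position(tp) < end) {
                for (ConsumerRecord<String, String> rec : consumer.poll(Duration.ofMillis(500))) {
                    latest.put(rec.key(), rec.value()); // null value = tombstone
                }
            }
        }
        System.out.println("user-123 -> " + latest.get("user-123"));
    }
}
```

Even after compaction, this reads every retained record to answer a single lookup, which is exactly what the KTable pattern below avoids.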
The Pattern That Works: KTables
The architecture that makes compacted topics useful isn't reading them directly. It's materializing them into a local store.
```java
Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "profile-lookup");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

StreamsBuilder builder = new StreamsBuilder();
KTable<String, UserProfile> users = builder.table(
    "user-profiles",
    Materialized.as("users-store")
);

KafkaStreams streams = new KafkaStreams(builder.build(), props);
streams.start();

// Fast local lookup (once the instance reaches RUNNING)
ReadOnlyKeyValueStore<String, UserProfile> store =
    streams.store(StoreQueryParameters.fromNameAndType(
        "users-store", QueryableStoreTypes.keyValueStore()));

UserProfile user = store.get("user-123"); // Milliseconds, not minutes
```

The compacted topic is the source of truth. The local RocksDB store is a cache. On restart, Kafka Streams replays the topic to rebuild the store.
This is the "Kafka as database" pattern that actually works.
Configuration for State Stores
```bash
kafka-topics --bootstrap-server localhost:9092 \
  --create --topic state-changelog \
  --config cleanup.policy=compact \
  --config min.cleanable.dirty.ratio=0.1 \
  --config segment.ms=300000 \
  --config max.compaction.lag.ms=86400000
```

| Parameter | Value | Effect |
|---|---|---|
| `min.cleanable.dirty.ratio` | 0.1 | Compact when 10% of the log is duplicates |
| `segment.ms` | 300000 (5 minutes) | Roll segments quickly so they become eligible for compaction |
| `max.compaction.lag.ms` | 86400000 (24 hours) | Force compaction within 24 hours |
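If Kafka Streams creates the changelog topic for you, the same topic-level overrides can be passed through the store definition instead of the CLI. A sketch under that assumption, reusing the earlier `builder` (with `KeyValueStore` from `org.apache.kafka.streams.state` and `Bytes` from `org.apache.kafka.common.utils`):

```java
// Topic-level overrides applied to the store's changelog topic
Map<String, String> changelogConfig = Map.of(
    "cleanup.policy", "compact",
    "min.cleanable.dirty.ratio", "0.1",
    "segment.ms", "300000",
    "max.compaction.lag.ms", "86400000"
);

KTable<String, UserProfile> users = builder.table(
    "user-profiles",
    Materialized.<String, UserProfile, KeyValueStore<Bytes, byte[]>>as("users-store")
        .withLoggingEnabled(changelogConfig)
);
```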
Good Fit vs Poor Fit
Good fit:
- CDC changelog topics (row state keyed by primary key)
- Configuration distribution
- Kafka Streams state stores
- Entity snapshots for downstream consumers
Poor fit:
- Point queries at scale
- Complex queries (filtering, joining)
- High-cardinality random access
- Low-latency reads without materialization
Common Errors
Null keys rejected:

```
Compacted topic cannot accept message without key
```

Compaction requires keys. Every producer must set one.

Compaction not running: Check `segment.ms`. Compaction only runs on closed segments. Low-throughput topics may keep segments open for days.
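One way to verify what a topic is actually configured with is the standard AdminClient. A minimal sketch, assuming a local broker and the `user-profiles` topic (the class name is invented):

```java
import java.util.List;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.common.config.ConfigResource;

public class CheckCompactionConfig {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");

        try (Admin admin = Admin.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "user-profiles");
            Set<String> interesting = Set.of("cleanup.policy", "segment.ms",
                    "min.cleanable.dirty.ratio", "max.compaction.lag.ms");

            // Print the effective compaction-related configs for the topic
            admin.describeConfigs(List.of(topic)).all().get()
                    .get(topic).entries().stream()
                    .filter(e -> interesting.contains(e.name()))
                    .forEach(e -> System.out.println(e.name() + " = " + e.value()));
        }
    }
}
```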
The Hybrid Pattern
For production systems needing both Kafka's durability and database queries:
```
Producer → Compacted Topic → Kafka Streams → Local RocksDB
              (truth)        (materialize)   (fast lookups)
```

The topic is the log. Everything else is a derived view. If the downstream store fails, rebuild it from the topic.
Compacted topics are powerful when used correctly. They're not a database replacement—they're a durable, replayable source of truth that feeds databases, caches, and local stores.
Book a demo to see how Conduktor Console provides visual configuration management and compaction metrics.