# Kafka Retention: When Messages Disappear and Why

"My retention is set to 10 minutes but I see messages from an hour ago." I hear this weekly. Kafka retention is simple in concept, confusing in practice.

The critical insight: **Kafka deletes segments, not messages.** As long as the active segment (the one currently receiving writes) keeps getting new messages, it is not deleted, regardless of how old the oldest messages inside it are.

> *We set 1-hour retention but ran out of disk after a week. Turns out our low-throughput topics had segment.ms at 7 days. Segments weren't rolling, so nothing got deleted.*
>
> *SRE at a logistics company*

## How Retention Actually Works

Kafka stores data in append-only log segments. Retention applies to closed segments only.

1. A background thread runs every 5 minutes (`log.retention.check.interval.ms`).
2. It identifies closed segments that are eligible for deletion.
3. It marks them for deletion, then removes the files after a 1-minute delay (`log.segment.delete.delay.ms`).

Actual retention can exceed the configured value by up to the segment roll time (7 days by default) plus the check interval (5 minutes) plus the deletion delay (1 minute).
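
To make that arithmetic concrete, here is a minimal sketch that adds up the worst case. The function name `worst_case_retention_ms` is hypothetical, and the two optional arguments simply default to the 5-minute and 1-minute values quoted above. Treat the result as a rough upper bound under this simplified model, not as something the broker reports.

```bash
# Hypothetical helper: rough worst-case message lifetime in milliseconds.
# Args: retention.ms, segment.ms, [check interval ms], [delete delay ms]
worst_case_retention_ms() {
  local retention_ms=$1
  local segment_ms=$2
  local check_interval_ms=${3:-300000}   # 5 minutes, the default quoted above
  local delete_delay_ms=${4:-60000}      # 1 minute, the default quoted above
  echo $(( retention_ms + segment_ms + check_interval_ms + delete_delay_ms ))
}

# 1-hour retention with the default 7-day segment roll
worst_case_retention_ms 3600000 604800000

# 1-hour retention with segments rolling every 30 minutes
worst_case_retention_ms 3600000 1800000
```

The first call prints 608760000 (about 7 days and 1 hour), the second 5760000 (96 minutes), which is why the segment roll time usually dominates.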

## Why Messages Persist Longer Than Expected

**Cause 1: Active segment never rolls**

A low-throughput topic can take a very long time to fill a segment (`segment.bytes`, default 1 GB), so the segment only rolls once it ages past `segment.ms` (default 7 days).

```bash
# For 1-hour retention, roll segments every 30 minutes
kafka-configs.sh --bootstrap-server localhost:9092 \
  --entity-type topics --entity-name my-topic \
  --alter --add-config retention.ms=3600000,segment.ms=1800000
```

**Cause 2: Segment contains old and new messages**

Segment deletion uses the **largest timestamp** in the segment, which is normally the last message's. A segment with messages from 10:00 and 10:50, with 1-hour retention, becomes eligible at 11:50. The 10:00 message persists for 1 hour 50 minutes.
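
The 10:00 / 10:50 example can be checked with a few lines of shell arithmetic. This is only a sketch: `segment_eligible_at_ms` is a hypothetical helper, and times are expressed as milliseconds since midnight to keep the numbers readable.

```bash
# Hypothetical helper: time at which a closed segment becomes eligible for
# deletion, given the largest message timestamp it contains.
segment_eligible_at_ms() {
  local max_timestamp_ms=$1
  local retention_ms=$2
  echo $(( max_timestamp_ms + retention_ms ))
}

# Times as milliseconds since midnight, mirroring the example above.
first_msg_ms=$(( 10 * 3600000 ))               # 10:00
last_msg_ms=$(( 10 * 3600000 + 50 * 60000 ))   # 10:50
eligible_ms=$(segment_eligible_at_ms "$last_msg_ms" 3600000)

echo "eligible at: $(( eligible_ms / 3600000 )):$(( eligible_ms % 3600000 / 60000 ))"
echo "oldest message age then: $(( (eligible_ms - first_msg_ms) / 60000 )) minutes"
```

This prints an eligibility time of 11:50 and an age of 110 minutes for the oldest message, before the check interval and deletion delay are added on top.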

**Cause 3: Future timestamps**

Producer clock skew can stamp messages with future timestamps. A segment holding such a message is not eligible until that future time, plus `retention.ms`, has passed.

**Fix:** Use broker-controlled timestamps:

```bash
kafka-configs.sh --bootstrap-server localhost:9092 \
  --entity-type topics --entity-name my-topic \
  --alter --add-config message.timestamp.type=LogAppendTime
```

## Time vs Size Retention

When both are configured, **the most restrictive wins**:

| Scenario | retention.ms | retention.bytes | Behavior |
|----------|--------------|-----------------|----------|
| Time only | 7 days | -1 (unlimited) | Delete when > 7 days old |
| Size only | -1 | 10 GB | Delete when partition > 10 GB |
| Both | 7 days | 10 GB | Delete if > 7 days OR > 10 GB |

**Important:** `retention.bytes` applies per partition, not per topic. A topic with 10 partitions and a 10 GB `retention.bytes` can use 100 GB in total, before replication.
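
The "either limit" rule from the table can be sketched as a small predicate evaluated per partition. This is a simplified model, not the broker's code: the broker works through the oldest closed segments one at a time, and `retention_triggered` is a hypothetical name. A limit of `-1` means the limit is disabled, as in the table.

```bash
# Simplified model (an assumption, not the broker's implementation):
# succeeds when either configured limit is exceeded for one partition.
# Args: oldest segment age ms, partition size bytes, retention.ms, retention.bytes
retention_triggered() {
  local oldest_segment_age_ms=$1
  local partition_bytes=$2
  local retention_ms=$3
  local retention_bytes=$4
  # Time limit: only checked when retention.ms is not -1
  if (( retention_ms >= 0 && oldest_segment_age_ms > retention_ms )); then
    return 0
  fi
  # Size limit: only checked when retention.bytes is not -1
  if (( retention_bytes >= 0 && partition_bytes > retention_bytes )); then
    return 0
  fi
  return 1
}

seven_days_ms=604800000
ten_gib=10737418240

# 8-day-old segment in a small partition: the time limit triggers deletion
retention_triggered 691200000 1048576 "$seven_days_ms" "$ten_gib" && echo "delete (time)"

# 1-hour-old data in a 12 GiB partition: the size limit triggers deletion
retention_triggered 3600000 12884901888 "$seven_days_ms" "$ten_gib" && echo "delete (size)"
```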

## Check Your Settings

You can verify retention settings via CLI or through a [topic management UI](https://docs.conduktor.io/guide/manage-kafka/kafka-resources/topics) that shows all topic configurations at a glance.

```bash
kafka-configs.sh --bootstrap-server localhost:9092 \
  --entity-type topics --entity-name my-topic \
  --describe --all | grep -E "(retention|segment)"
```

Note: Topic-level properties are `retention.ms` and `segment.ms`. The broker-level defaults are `log.retention.ms` and `log.roll.ms`; the `log.` prefix is broker-only.

## Common Mistakes

**Setting retention shorter than segment roll:**

```properties
# WRONG: messages can live up to 7 days despite 1-hour retention
retention.ms=3600000
# segment.ms left at its default of 7 days
segment.ms=604800000
```

**Forgetting retention.bytes is per partition:**

A topic with `retention.bytes` set to 10 GB (`retention.bytes=10737418240`; the value is in bytes) and 50 partitions can consume 500 GB, before replication.

**Not accounting for replication:**

Total storage = partition size × partitions × replication factor. A 100 GB topic with RF=3 uses 300 GB.
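
As a quick sanity check, the formula fits in a few lines of shell. This is a sketch with a hypothetical helper name, `topic_disk_bytes`, and it treats 1 GB as 2^30 bytes, which is an assumption about the units used in this article.

```bash
# Hypothetical helper: cluster-wide disk usage for one topic, in bytes.
# Args: bytes per partition, partition count, replication factor
topic_disk_bytes() {
  local bytes_per_partition=$1
  local partitions=$2
  local replication_factor=$3
  echo $(( bytes_per_partition * partitions * replication_factor ))
}

gib=$(( 1024 * 1024 * 1024 ))

# 10 GiB per partition, 50 partitions, RF=3
total=$(topic_disk_bytes $(( 10 * gib )) 50 3)
echo "$(( total / gib )) GiB across the cluster"
```

With these inputs the sketch prints 1500 GiB: the 500 GB from the previous mistake, multiplied by the replication factor.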

## Verify Segments on Disk

```bash
ls -la /var/kafka/data/my-topic-0/
# A .log file's modification time shows when that segment was last written
```
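
To narrow the listing down to segments that look overdue, one option is to filter on modification time. The helper below is a sketch: `stale_segments` is a hypothetical name, the directory is whatever your broker's `log.dirs` points at, and modification time is only an approximation of the largest message timestamp the broker actually uses.

```bash
# Hypothetical helper: list segment files in a partition directory that have
# not been written to for more than the given number of minutes.
# Args: partition directory, age threshold in minutes
stale_segments() {
  local partition_dir=$1
  local minutes=$2
  find "$partition_dir" -maxdepth 1 -name '*.log' -mmin +"$minutes" | sort
}

# Example (path is an assumption, adjust to your log.dirs):
#   stale_segments /var/kafka/data/my-topic-0 60
```

If files older than your retention keep showing up here long after the check interval has passed, the segment timing issues described above are the first thing to look at.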

Time-based retention is a minimum guarantee, not a maximum. In the worst case, expect messages to live for `retention.ms` + `segment.ms` + `log.retention.check.interval.ms` + `log.segment.delete.delay.ms`. A cluster that "should" use 500 GB based on throughput × retention might actually use 800 GB because of segment timing.

[Book a demo](https://www.conduktor.io/contact/demo) to see how Conduktor Console shows storage usage, retention settings, and segment counts for every topic.
