# When Kafka Exactly-Once Semantics Are Worth the Performance Cost

I've seen teams enable exactly-once semantics (EOS) because "duplicates are bad" without measuring the cost. Then they wonder why throughput dropped 40% and commit latency tripled.

The uncomfortable truth: most applications don't need EOS. At-least-once with idempotent consumers is simpler, faster, and sufficient for nearly every real-world use case.

> *We enabled transactions for 'safety.' Throughput dropped from 100k to 60k msg/s. Then we realized our consumers write to PostgreSQL anyway—EOS didn't help. We switched to idempotent consumers and got our performance back.*
>
> *Platform Engineer at a payments company*

## The Hidden Cost

Kafka's transactional exactly-once has overhead rarely discussed in marketing:

| Mode | Throughput | p99 latency |
|------|------------|-------------|
| At-least-once (`acks=all`) | 100k msg/s | 15 ms |
| Idempotent producer | 98k msg/s | 16 ms |
| **Transactional** | 60k msg/s | 45 ms |

That's a 40% throughput reduction. Every transaction requires round trips to the transaction coordinator, which persists its state to the internal `__transaction_state` topic. Each transaction commit adds 10-50 ms of latency.

The `read_committed` consumer tax: consumers using `isolation.level=read_committed` can't read past the last stable offset (LSO), which is the first offset of the oldest open transaction on the partition. One stuck producer blocks every `read_committed` consumer on the affected partitions until its transaction is aborted, which can take up to `transaction.timeout.ms` (default: 60 seconds). [Consumer group monitoring](https://docs.conduktor.io/guide/monitor-brokers-apps) surfaces these stalls before they cascade.
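To make the stall concrete, here is a toy model in plain Java (no Kafka client involved) of how one open transaction pins the offset that `read_committed` consumers can reach. `PartitionModel` and its methods are hypothetical names for illustration. The model ignores control markers and replication, so treat it as a sketch of the rule, not of broker internals.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of one partition's last stable offset (LSO). Not Kafka code. */
class PartitionModel {
    private long logEndOffset = 0; // next offset to be written

    // producerId -> offset of the first record in that producer's open transaction
    private final Map<String, Long> openTxnFirstOffset = new TreeMap<>();

    /** Appends one transactional record; the first append opens the transaction. */
    long append(String producerId) {
        openTxnFirstOffset.putIfAbsent(producerId, logEndOffset);
        return logEndOffset++;
    }

    /** Commit or abort: either way the transaction stops pinning the LSO. */
    void endTransaction(String producerId) {
        openTxnFirstOffset.remove(producerId);
    }

    /** read_committed consumers can only read offsets strictly below this value. */
    long lastStableOffset() {
        return openTxnFirstOffset.values().stream()
                .mapToLong(Long::longValue)
                .min()
                .orElse(logEndOffset);
    }
}
```

If a producer called `stuck` writes offset 0 and never finishes, while a producer called `healthy` writes offsets 1 and 2 and commits, `lastStableOffset()` stays at 0. The committed records at 1 and 2 only become readable once the `stuck` transaction commits or is aborted.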

**Cascading failure risk:** If many producers across many partitions hit errors at the same time (network partition, downstream outage), every `read_committed` consumer on those partitions stalls until the open transactions time out, and your entire consumer fleet can stop. This cascade has caused multi-hour outages. Set `transaction.timeout.ms` to 15 seconds or less in production, and alert on transactions that stay open longer than expected.
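As a rough sketch, the settings discussed above could be collected like this. The property keys are standard Kafka client configuration names. The `EosClientConfig` class, the 15000 ms value, and the ids passed in are assumptions for illustration. Connection and serializer settings are omitted, so these `Properties` are not enough on their own to construct a client.

```java
import java.util.Properties;

/** Hypothetical helper that gathers the EOS-related client settings in one place. */
class EosClientConfig {

    /** Producer settings for transactional writes with a short transaction timeout. */
    static Properties transactionalProducer(String transactionalId) {
        Properties p = new Properties();
        p.setProperty("transactional.id", transactionalId);
        p.setProperty("enable.idempotence", "true");
        p.setProperty("acks", "all");
        // Client default is 60000 ms; 15000 ms is this article's suggested ceiling.
        p.setProperty("transaction.timeout.ms", "15000");
        return p;
    }

    /** Consumer settings that hide records from open or aborted transactions. */
    static Properties readCommittedConsumer(String groupId) {
        Properties p = new Properties();
        p.setProperty("group.id", groupId);
        p.setProperty("isolation.level", "read_committed");
        // Offsets are committed explicitly rather than on a timer.
        p.setProperty("enable.auto.commit", "false");
        return p;
    }
}
```

A shorter timeout bounds how long a dead producer can hold back consumers, at the price of aborting transactions that are merely slow.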

## The Real Exactly-Once Boundary

Kafka's EOS guarantees atomicity for:
- Writes to multiple partitions
- Consumer offset commits
- Kafka Streams state stores (through their changelog topics)

It does NOT guarantee exactly-once for:
- Database writes
- REST API calls
- Any external system

```java
producer.beginTransaction();
producer.send(new ProducerRecord<>("orders", key, order));
httpClient.post("https://payments.example.com/charge", order);  // NOT transactional
producer.commitTransaction();
// If HTTP succeeded but commit fails: customer charged, no Kafka record
```

If your consumer writes to PostgreSQL, Redis, or any external system, EOS doesn't help. You need idempotent consumers anyway.

## The Idempotent Consumer Alternative

An idempotent consumer produces the same result whether a message is processed once or multiple times. This works for 95% of use cases where you write to external systems.

```java
public void processOrder(ConsumerRecord<String, OrderEvent> record) {
    OrderEvent order = record.value();
    // Relies on a primary key or unique constraint on orders.order_id:
    // a redelivered event hits the conflict and is skipped.
    jdbcTemplate.update("""
        INSERT INTO orders (order_id, customer_id, total, status)
        VALUES (?, ?, ?, ?)
        ON CONFLICT (order_id) DO NOTHING
        """,
        order.getOrderId(),
        order.getCustomerId(),
        order.getTotal(), "PENDING");
}
```

Process the same message 10 times? One row in the database. No transactions, no coordinator overhead, no blocked consumers.
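The same behaviour can be checked without a database. The sketch below uses an in-memory map as a stand-in for the `orders` table, with the map key playing the role of the unique `order_id`. `InMemoryOrdersTable` and its methods are hypothetical and exist only to show the effect of redelivery.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** In-memory stand-in for the orders table; the map key acts as the unique order_id. */
class InMemoryOrdersTable {
    private final Map<String, String> statusByOrderId = new LinkedHashMap<>();

    /** Mirrors INSERT ... ON CONFLICT (order_id) DO NOTHING: true only for a new row. */
    boolean insertIfAbsent(String orderId, String status) {
        return statusByOrderId.putIfAbsent(orderId, status) == null;
    }

    int rowCount() {
        return statusByOrderId.size();
    }

    /** Processes deliveries that may include redeliveries; returns rows actually inserted. */
    static int consume(InMemoryOrdersTable table, List<String> deliveredOrderIds) {
        int inserted = 0;
        for (String orderId : deliveredOrderIds) {
            if (table.insertIfAbsent(orderId, "PENDING")) {
                inserted++;
            }
        }
        return inserted;
    }
}
```

Feeding it a sequence such as `o-1, o-2, o-1, o-1, o-2` leaves two rows, the same as delivering each order once.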

**Design for idempotency:**

```sql
-- State replacement (idempotent)
UPDATE inventory SET quantity = 50 WHERE product_id = 'SKU-123';

-- Delta operation (NOT idempotent)
UPDATE inventory SET quantity = quantity - 1 WHERE product_id = 'SKU-123';
```

Carry full state in events, not deltas. Duplicates become harmless.
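A small simulation shows the difference when an event is delivered twice. The `Inventory` class is a hypothetical in-memory model, not tied to any database or Kafka API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory inventory used to compare the two update styles. */
class Inventory {
    private final Map<String, Integer> quantityBySku = new HashMap<>();

    /** State replacement: applying the same event again changes nothing. */
    void setQuantity(String sku, int quantity) {
        quantityBySku.put(sku, quantity);
    }

    /** Delta: every application changes the stored value, including duplicates. */
    void decrement(String sku, int by) {
        quantityBySku.merge(sku, -by, Integer::sum);
    }

    int get(String sku) {
        return quantityBySku.getOrDefault(sku, 0);
    }
}
```

Starting from 51 units, applying "quantity is now 50" twice leaves 50. Applying "subtract 1" twice leaves 49, which is one unit short of the truth.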

## When EOS Actually Matters

Enable exactly-once when ALL of these are true:

1. **Pure Kafka-to-Kafka processing** — no external systems
2. **Atomic multi-partition writes required** — one message fans out to several topics
3. **Duplicates break business logic** — can't be handled by idempotent consumers
4. **Latency budget permits** — you can absorb 10-50ms commit overhead
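
One way to keep the checklist honest in design reviews is to write it down as a predicate. The sketch below encodes the four conditions above. The class and parameter names are made up for illustration, and answering each question still takes measurement and judgment.

```java
/** Hypothetical encoding of the four-point checklist; all conditions must hold. */
class EosChecklist {
    static boolean shouldEnableEos(
            boolean kafkaToKafkaOnly,
            boolean needsAtomicMultiPartitionWrites,
            boolean duplicatesBreakLogicDespiteIdempotency,
            boolean latencyBudgetAllowsCommitOverhead) {
        return kafkaToKafkaOnly
                && needsAtomicMultiPartitionWrites
                && duplicatesBreakLogicDespiteIdempotency
                && latencyBudgetAllowsCommitOverhead;
    }
}
```

A pipeline that writes to PostgreSQL fails the first condition, so the answer is `false` however the other three come out.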

The legitimate use case is Kafka Streams with stateful processing:

```java
props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_V2);
```

This keeps state stores consistent with output topics. Kafka Streams handles the complexity internally.
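
For context, a slightly fuller configuration might look like the sketch below. It uses plain string keys so it compiles without the Kafka Streams dependency; `"processing.guarantee"` is the key behind `StreamsConfig.PROCESSING_GUARANTEE_CONFIG`. The `StreamsEosConfig` class and the application id are hypothetical, and bootstrap servers and serdes are left out.

```java
import java.util.Properties;

/** Hypothetical helper building the EOS-related part of a Kafka Streams config. */
class StreamsEosConfig {
    static Properties exactlyOnce(String applicationId) {
        Properties props = new Properties();
        props.setProperty("application.id", applicationId);
        props.setProperty("processing.guarantee", "exactly_once_v2");
        // Under exactly-once, Streams commits (and ends a transaction) on this interval.
        // 100 ms is the default in that mode; raising it trades end-to-end latency
        // for fewer transaction commits.
        props.setProperty("commit.interval.ms", "100");
        return props;
    }
}
```

`commit.interval.ms` is the main knob for the latency versus overhead trade-off described earlier, so it is worth setting deliberately rather than inheriting.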

## Decision Framework

| Situation | Recommendation |
|-----------|----------------|
| Writing to database | Idempotent consumer |
| Calling external APIs | Idempotent consumer |
| Kafka Streams with state | `exactly_once_v2` |
| Multi-topic atomic writes | Transactions |
| Logging, metrics, analytics | At-least-once |
| "Duplicates are bad" | Idempotent consumer |

The last row is intentional. "Duplicates are bad" isn't a reason to use EOS. It's a reason to make your consumers idempotent.

Stop reaching for exactly-once as the default. Start with at-least-once and idempotent consumers. Add EOS only when you've proven you need it and can afford the overhead.

