Consumer auto offset reset behavior
Kafka auto.offset.reset controls where a consumer starts when no committed offset exists. Compare earliest vs latest vs none with real failure scenarios.
Learn how to configure consumer offset reset behavior
When a consumer starts without committed offsets, or when committed offsets are invalid, Kafka needs to know where to start reading. The auto.offset.reset configuration controls this behavior and is critical for understanding data processing guarantees.
What you'll learn:
- The three auto offset reset options and when each is triggered
- How to choose the right setting for your use case
- Best practices for production and development environments
- How to handle offset reset scenarios programmatically
Auto offset reset options
The reset policy applies in two situations: the consumer group has no committed offset for a partition, or the committed offset is no longer valid (e.g., because the data has been deleted). In either case the consumer uses auto.offset.reset to decide where to start reading. The setting accepts three values.
earliest
auto.offset.reset=earliest - Consumer will start reading from the beginning of the partition
- Reads all available messages from the earliest available offset
- Useful for reprocessing all historical data
- Use case: Data migration, audit requirements, complete reprocessing
latest (default)
auto.offset.reset=latest - Consumer will start reading from the end of the partition
- Only processes new messages produced after the consumer starts
- Use case: Real-time processing where historical data is not needed
none
auto.offset.reset=none - Consumer throws an exception (NoOffsetForPartitionException in the Java client) if no previous offset is found for the consumer group
- Forces explicit offset management
- Use case: Strict control over consumer behavior, prevents accidental data loss or reprocessing
When auto offset reset is triggered
The auto.offset.reset behavior is triggered in these scenarios:
| Scenario | Description | Example |
|---|---|---|
| New consumer group | First time a consumer group subscribes to a topic | Deploying a new application |
| Invalid offset | Committed offset no longer exists (data deleted due to retention) | Consumer offline longer than retention period |
| Offset out of range | Committed offset is beyond the current log boundaries | Log truncation or corruption |
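The trigger conditions in the table can be captured in a small model. The sketch below is not Kafka's implementation. It is a simplified, self-contained illustration, and the names `OffsetResetModel`, `Policy`, and `startingOffset` are hypothetical. It assumes `logStart` is the earliest retained offset and `logEnd` is the offset the next produced record will receive.

```java
import java.util.OptionalLong;

/** Simplified model of how a consumer picks its starting offset (illustration only). */
final class OffsetResetModel {
    enum Policy { EARLIEST, LATEST, NONE }

    static long startingOffset(OptionalLong committed, long logStart, long logEnd, Policy policy) {
        // A committed offset inside the retained log is used directly;
        // auto.offset.reset is not consulted in that case.
        if (committed.isPresent()) {
            long offset = committed.getAsLong();
            if (offset >= logStart && offset <= logEnd) {
                return offset;
            }
        }
        // No committed offset, or it is out of range: apply the reset policy.
        switch (policy) {
            case EARLIEST:
                return logStart;
            case LATEST:
                return logEnd;
            default:
                throw new IllegalStateException("No valid offset and auto.offset.reset=none");
        }
    }

    public static void main(String[] args) {
        // New consumer group: no committed offset.
        System.out.println(startingOffset(OptionalLong.empty(), 100, 500, Policy.EARLIEST)); // 100
        // Committed offset 40 was deleted by retention (the log now starts at 100).
        System.out.println(startingOffset(OptionalLong.of(40), 100, 500, Policy.LATEST)); // 500
        // Valid committed offset: the policy is ignored.
        System.out.println(startingOffset(OptionalLong.of(250), 100, 500, Policy.LATEST)); // 250
    }
}
```

The third call is the case that is easy to overlook: as long as the committed offset is still valid, changing auto.offset.reset has no effect on where the consumer resumes.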
Common scenarios
Scenario 1: New consumer group
// First time this consumer group runs
Properties props = new Properties();
props.put("group.id", "new-consumer-group");
props.put("auto.offset.reset", "earliest"); // Will read from beginning
Scenario 2: Data retention cleanup
// Consumer was offline for too long, committed offset expired
// Behavior depends on auto.offset.reset setting
Properties props = new Properties();
props.put("group.id", "existing-group");
props.put("auto.offset.reset", "latest"); // Will skip to latest
Best practices
For production systems
# Be explicit about offset reset behavior
auto.offset.reset=latest
# Enable offset commits
enable.auto.commit=true
auto.commit.interval.ms=5000
For development/testing
# Often want to reprocess data
auto.offset.reset=earliest
# May want manual control
enable.auto.commit=false
For critical data processing
# Prevent accidental data loss or reprocessing
auto.offset.reset=none
# Handle exceptions explicitly in code
Error handling example
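import java.time.Duration;
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.InvalidOffsetException;
import org.apache.kafka.clients.consumer.KafkaConsumer;

Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
props.put("group.id", "existing-group"); // placeholder group name
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("auto.offset.reset", "none");
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("my-topic"));
while (true) {
    try {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
        // Process records
    } catch (InvalidOffsetException e) {
        // Parent of NoOffsetForPartitionException and OffsetOutOfRangeException:
        // no valid offset exists for the partitions in e.partitions().
        // Decide whether to seek to beginning or end, then keep polling
        consumer.seekToBeginning(e.partitions());
        // or consumer.seekToEnd(e.partitions());
    }
}
Offset management strategies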
Automatic offset management
- Use enable.auto.commit=true
- Set appropriate auto.commit.interval.ms
- Choose suitable auto.offset.reset policy
Manual offset management
- Use enable.auto.commit=false
- Call commitSync() or commitAsync() after processing
- Handle offset reset scenarios explicitly
External offset storage
- Store offsets in external systems (database, file system)
- Use seek() methods to position consumer
- Implement custom offset management logic
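As one possible shape for external offset storage, the sketch below keeps offsets in a local properties file. `FileOffsetStore` and its methods are hypothetical names, the file format is an assumption, and the class is neither crash-safe nor safe for concurrent writers. It stores the offset of the next record to read, which matches the convention Kafka uses for committed offsets.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.OptionalLong;
import java.util.Properties;

/** Hypothetical file-backed offset store; illustration only, not crash-safe. */
final class FileOffsetStore {
    private final Path file;

    FileOffsetStore(Path file) {
        this.file = file;
    }

    /** Records the offset of the next record to read for a topic partition. */
    void save(String topic, int partition, long nextOffset) throws IOException {
        Properties offsets = readAll();
        offsets.setProperty(topic + "-" + partition, Long.toString(nextOffset));
        try (OutputStream out = Files.newOutputStream(file)) {
            offsets.store(out, "next offset to read per topic-partition");
        }
    }

    /** Returns the stored offset, or empty if this partition has never been saved. */
    OptionalLong load(String topic, int partition) throws IOException {
        String value = readAll().getProperty(topic + "-" + partition);
        return value == null ? OptionalLong.empty() : OptionalLong.of(Long.parseLong(value));
    }

    private Properties readAll() throws IOException {
        Properties offsets = new Properties();
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                offsets.load(in);
            }
        }
        return offsets;
    }
}
```

On startup, or in `ConsumerRebalanceListener.onPartitionsAssigned`, the loaded value would be passed to the consumer's `seek()` for that partition. When `load` returns empty, the application has to choose its own fallback, such as `seekToBeginning` or `seekToEnd`.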
Data loss vs duplication
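- auto.offset.reset=latest can cause data loss: when a reset is triggered, every retained message the consumer has not yet read is skipped, including messages that arrived while the consumer was down. A consumer whose committed offset is still valid resumes from that offset and is not affected.
- auto.offset.reset=earliest can cause message duplication if the consumer group is recreated, because previously processed messages are read again
- auto.offset.reset=none requires explicit error handling but provides the most control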
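The trade-off can be made concrete with a little arithmetic. The helper below is a hypothetical illustration, not part of any Kafka API. It assumes a reset is triggered after the application had already processed every record below `nextToProcess`, with `logStart` as the earliest retained offset and `logEnd` as the offset the next produced record will receive.

```java
/** Hypothetical helper quantifying the impact of a reset; illustration only. */
final class ResetImpact {
    /** Retained records that are never read when the position jumps to logEnd. */
    static long skippedByLatest(long nextToProcess, long logStart, long logEnd) {
        long firstUnread = Math.max(nextToProcess, logStart);
        return Math.max(0, logEnd - firstUnread);
    }

    /** Already-processed records that are read again when the position jumps to logStart. */
    static long reprocessedByEarliest(long nextToProcess, long logStart, long logEnd) {
        long processedEnd = Math.min(nextToProcess, logEnd);
        return Math.max(0, processedEnd - logStart);
    }

    public static void main(String[] args) {
        // The log retains offsets 100..499; the application had processed everything below 450.
        System.out.println(skippedByLatest(450, 100, 500));       // 50
        System.out.println(reprocessedByEarliest(450, 100, 500)); // 350
    }
}
```

In this example a reset with latest loses 50 records, while a reset with earliest replays 350 records that downstream processing must tolerate, for instance by being idempotent.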
Configuration recommendations
| Use case | auto.offset.reset | enable.auto.commit | Notes |
|---|---|---|---|
| High-throughput | latest | true | Accept potential data loss for speed |
| Critical data | none | false | Manual control, handle exceptions |
| Replay scenarios | earliest | false | Process all historical data |
| Development | earliest | true | Easy testing with full data |
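The table above could be encoded as reusable presets so that each service states its use case instead of repeating raw settings. This is a minimal sketch: `ConsumerPresets`, `UseCase`, and `forUseCase` are hypothetical names, and only the two settings from the table are populated.

```java
import java.util.Properties;

/** Hypothetical presets mirroring the recommendations table; illustration only. */
final class ConsumerPresets {
    enum UseCase { HIGH_THROUGHPUT, CRITICAL_DATA, REPLAY, DEVELOPMENT }

    static Properties forUseCase(UseCase useCase) {
        Properties props = new Properties();
        switch (useCase) {
            case HIGH_THROUGHPUT:
                set(props, "latest", true);   // accept potential data loss for speed
                break;
            case CRITICAL_DATA:
                set(props, "none", false);    // manual control, handle exceptions
                break;
            case REPLAY:
                set(props, "earliest", false); // process all historical data
                break;
            case DEVELOPMENT:
                set(props, "earliest", true); // easy testing with full data
                break;
        }
        return props;
    }

    private static void set(Properties props, String reset, boolean autoCommit) {
        props.setProperty("auto.offset.reset", reset);
        props.setProperty("enable.auto.commit", Boolean.toString(autoCommit));
    }
}
```

The returned properties still need bootstrap.servers, group.id, and the key and value deserializers before they can be passed to a KafkaConsumer.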
See it in practice with Conduktor
Conduktor Console lets you monitor consumer group offsets and lag in real time. Identify when offset resets occur and track consumer position across partitions to validate your offset management strategy.