Kafka Data Contracts: Prevent Breaking Changes
Data contracts prevent breaking changes in Kafka. Enforce schema compatibility, versioning, and migration rules before bad data ships.

A breaking schema change at 3 AM is indistinguishable from a production outage.
Producers deploy a new schema version that removes a field. Consumers built against the old schema expect that field and crash when it's missing. Messages pile up unprocessed. Lag grows from zero to millions. Alerts fire. Engineers wake up to fix a problem that schema validation should have prevented six hours earlier during deployment.
This happens because schema evolution isn't treated as a contract. It's treated as an implementation detail. Producers change schemas when it's convenient. Consumers discover the changes when they break. The missing piece is enforcement: data contracts that define compatibility rules and reject breaking changes before they reach production.
Real data contracts have three components: schema definition (structure), compatibility mode (evolution rules), and validation (enforcement). Most teams have schema definitions. Few enforce compatibility rules. Fewer still treat compatibility as a blocking requirement—changes that break compatibility should fail deployment, not cause incidents.
What Data Contracts Actually Are
Data contracts are more than schemas stored in a registry. They're enforceable agreements between producers and consumers about data structure, evolution rules, and SLAs.
Schema definition describes message structure: fields, types, and constraints. Avro, Protobuf, and JSON Schema are common formats. The schema is machine-readable (code generation, validation) and human-readable (documentation).
Compatibility mode defines which schema changes are allowed without breaking consumers. Four modes exist:
- BACKWARD compatibility lets you delete fields or add fields with defaults. Consumers built against the new schema can read messages produced with the old schema. Upgrade consumers first, then let producers follow.
- FORWARD compatibility lets you add fields or delete fields with defaults. Consumers built against the old schema can read messages produced with the new schema. Upgrade producers first; use this when consumers upgrade slowly.
- FULL compatibility combines both: schemas can add or delete fields as long as they have defaults. Both old and new producers/consumers work together. This is the safest mode but most restrictive.
- NONE compatibility allows any schema change. Use this only when you control both producer and consumer deployments and can coordinate breaking changes.
Validation enforces compatibility at deployment time and at runtime. Schema Registry provides compatibility-check APIs, but enforcement requires integration: CI/CD pipelines that fail builds when a schema change breaks compatibility, GitOps workflows that reject pull requests with incompatible schemas, or runtime data quality policies that reject messages not matching registered schemas.
The contract is: "If you produce to this topic, your schema must be compatible with registered schemas according to the configured mode. If it's not, your deployment fails."
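"Configured mode" is literal: compatibility is set per subject in Schema Registry, and every schema version registered afterward is checked against it. A minimal sketch of the JSON body for the registry's PUT /config/<subject> endpoint (subject name and mode are whatever your topic requires):

```json
{
  "compatibility": "FULL"
}
```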
Compatibility Modes and When to Use Each
The choice of compatibility mode determines how schemas evolve safely.
BACKWARD compatibility works when consumers upgrade before producers. Consumers move to the new schema first, then producers roll out at their own pace. Because a consumer on the new schema can still read every message already in the topic, BACKWARD also covers replays and new consumers processing historical data, which is why it is the Schema Registry default.
Example: A schema starts with {userId: string, email: string}. BACKWARD compatibility allows adding a new field with a default: {userId: string, email: string, phoneNumber: string = ""}. Consumers on the new schema fill in the default when older messages lack the field; consumers still on the old schema simply ignore it. No breaking change.
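In Avro terms, the evolved schema might look like the sketch below (record name is illustrative). The default is what lets a consumer using this schema read older messages that never carried phoneNumber:

```json
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "userId", "type": "string"},
    {"name": "email", "type": "string"},
    {"name": "phoneNumber", "type": "string", "default": ""}
  ]
}
```

Registered under BACKWARD (or FULL) compatibility, this version passes the check against its predecessor.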
Deleting a field actually passes the BACKWARD check: a consumer on the new schema simply stops reading data it no longer needs. The risk is operational. Any consumer still on the old schema fails the moment email disappears from new messages, which is why BACKWARD requires upgrading consumers before producers. If that order can't be guaranteed, remove fields in two phases (make the field optional first, migrate consumers off it, then delete it in a later schema version) or migrate to a new topic.
FORWARD compatibility works when producers upgrade first and consumers catch up later. This is less common, but it fits data warehouse scenarios: the producing team ships new fields, and analytics consumers still on the old schema keep reading new messages, ignoring fields they don't yet understand, until they upgrade.
Example: A schema starts with {orderId: string, amount: number}. FORWARD compatibility allows adding a new field even without a default. Consumers still on the old schema ignore it and keep working. The catch is the reverse direction: because the field has no default, a consumer that upgrades before all producers have switched will fail on old-format messages, which is why FORWARD assumes producers upgrade first.
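A comparable Avro sketch (currency is an illustrative addition, and amount is shown as an Avro double): because the new field has no default, this version passes a FORWARD check but not a BACKWARD one, which is exactly why producers switch first under this mode:

```json
{
  "type": "record",
  "name": "Order",
  "fields": [
    {"name": "orderId", "type": "string"},
    {"name": "amount", "type": "double"},
    {"name": "currency", "type": "string"}
  ]
}
```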
FULL compatibility is safest when upgrade order is unpredictable or when both producers and consumers upgrade independently. This is common in microservice architectures where release schedules don't coordinate.
The restriction: fields can only be added or deleted if they have default values. This ensures old and new versions interoperate regardless of upgrade order.
NONE compatibility disables schema compatibility checks. Use this only in controlled environments where a single team owns producer and consumer and can coordinate breaking changes. In multi-team environments, NONE compatibility guarantees eventual breaking changes that cause incidents.
Breaking Change Scenarios
Understanding what breaks compatibility prevents production incidents.
Deleting a required field (one without a default) breaks FORWARD compatibility. Consumers still reading with the old schema expect the field; when it disappears from new messages, deserialization fails. Solution: make the field optional (add a default), deploy consumers that no longer depend on it, then remove it in a later schema version.
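A minimal Avro sketch of phase one, assuming the goal is to retire email: the field keeps its type but gains a default, a change that passes both BACKWARD and FORWARD checks and lets consumers tolerate its absence before a later version removes it entirely:

```json
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "userId", "type": "string"},
    {"name": "email", "type": "string", "default": ""}
  ]
}
```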
Adding a required field without a default breaks BACKWARD compatibility. Consumers on the new schema expect the field, but messages produced with the old schema don't carry it and there is no default to fall back on, so deserialization fails. Solution: add the field with a default value, so consumers fill in the default whenever the field is missing.
Changing field types almost always breaks compatibility. Changing amount from integer to string is a breaking change—consumers expecting integer arithmetic will fail. Solution: add a new field (amountString), migrate consumers to use it, then deprecate the old field.
Renaming fields is a breaking change even if the type stays the same. Schema compatibility checks don't understand "this field was renamed"; they see a deletion (the old name) and an addition (the new name). Solution: add the new field, populate both during the transition, migrate consumers, then remove the old field.
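A sketch of the transitional schema, using hypothetical field names (userName being renamed to displayName): both fields coexist, the new one carries a default so the change stays compatible, and producers populate both until consumers have migrated off the old name:

```json
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "userName", "type": "string"},
    {"name": "displayName", "type": "string", "default": ""}
  ]
}
```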
Removing the default value from an optional field makes it required again. Consumers on the new schema can no longer fill in a value when the field is missing, so messages from older schema versions that never carried the field fail to deserialize, breaking backward compatibility with those versions. Solution: don't remove defaults from optional fields.
Migration Rules for Incompatible Changes
Confluent introduced migration rules that transform data between incompatible schema versions. Instead of blocking breaking changes entirely, migration rules let you evolve schemas incompatibly while maintaining consumer compatibility.
How it works: A migration rule contains code that transforms data from one schema version to another. When consumers read messages, the rule runs automatically at deserialization time, translating old schema data into new schema format.
Example: Schema v1 has {firstName: string, lastName: string}. Schema v2 changes to {fullName: string}. This is incompatible—removing two fields and adding one breaks both BACKWARD and FORWARD compatibility.
Migration rule transforms v1 to v2:
fullName = firstName + " " + lastName
Consumers built against the v2 schema read messages produced with the v1 schema. The migration rule transforms {firstName: "Jane", lastName: "Doe"} into {fullName: "Jane Doe"} transparently.
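With Confluent's data contract rule sets, the transformation is typically written in JSONata and registered alongside the new schema version; it runs in UPGRADE mode when consumers deserialize older data. The sketch below is illustrative only, and the exact envelope field names should be checked against the Schema Registry documentation:

```json
{
  "ruleSet": {
    "migrationRules": [
      {
        "name": "v1_to_v2_fullName",
        "kind": "TRANSFORM",
        "type": "JSONATA",
        "mode": "UPGRADE",
        "expr": "$merge([$sift($, function($v, $k) { $k != 'firstName' and $k != 'lastName' }), {'fullName': firstName & ' ' & lastName}])"
      }
    ]
  }
}
```

A matching DOWNGRADE rule can be registered as well, so consumers pinned to the old schema version can still read data produced with the new one.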
Migration rules support incompatible evolution that would otherwise require:
- Creating a new topic (data duplication, consumer migration)
- Dual-writing to old and new schemas (producer complexity, data consistency risk)
- Coordinated flag-day deployment (high coordination overhead, rollback difficulty)
Limitations: migration rules add deserialization overhead (transformation runs for every message) and increase complexity (rules are code that needs testing and maintenance). Use them for genuine breaking changes that can't be avoided, not as a shortcut around good schema design.
Compatibility Groups
Compatibility groups allow controlled breaking changes by organizing schema versions into isolated groups. Instead of enforcing compatibility across all schema versions, compatibility checks apply only within the same group.
Use case: You need a breaking change that can't be handled by migration rules. Instead of creating a new topic, create a new compatibility group for the schema. Schema v4 in group "legacy" and schema v5 in group "v2" are not checked for compatibility with each other.
Producers and consumers coordinate which group they use. Legacy consumers use group "legacy" and read schemas v1-v4. New consumers use group "v2" and read schemas v5+. Both groups coexist on the same topic.
This enables gradual migration: new consumers start reading with v5 schema, legacy consumers continue with v4, and eventually, legacy consumers are retired.
Trade-off: compatibility groups add coordination overhead and require clear documentation about which consumers use which groups.
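In Confluent Schema Registry, groups are expressed through metadata: the subject config names a metadata property as the compatibilityGroup, and each schema version declares its value for that property; versions with different values are never checked against each other. A rough sketch, with the property name and group values illustrative (check the registry documentation for exact field names). First, the subject config:

```json
{
  "compatibilityGroup": "application.major.version"
}
```

Then, the metadata attached when registering a schema in the new group:

```json
{
  "metadata": {
    "properties": {
      "application.major.version": "2"
    }
  }
}
```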
Testing Schema Evolution
Schema evolution testing prevents compatibility breaks from reaching production.
Build-time validation checks compatibility during CI/CD. Maven and Gradle plugins for Schema Registry validate new schemas against registered versions:
# Fails build if schema breaks compatibility
mvn schema-registry:test-compatibility
This catches breaks before code review, not after deployment. If the compatibility check fails, the build fails. No manual review needed: the tooling enforces the contract.
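Teams not using the Maven or Gradle plugins can call the registry's compatibility endpoint directly from CI. A minimal sketch of the JSON body POSTed to /compatibility/subjects/<subject>/versions/latest, with the candidate schema passed as an escaped string; the response reports whether the schema is compatible with the latest registered version:

```json
{
  "schema": "{\"type\":\"record\",\"name\":\"User\",\"fields\":[{\"name\":\"userId\",\"type\":\"string\"},{\"name\":\"email\",\"type\":\"string\"}]}"
}
```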
Staging environment testing validates end-to-end compatibility. Deploy producers with new schema to staging, verify existing consumers still function. Then deploy consumers with new schema, verify they handle both old and new messages correctly.
Simulate the full upgrade in staging: register the new schema version, produce and consume with mixed versions, and verify that the topic's compatibility check and downstream processing both behave as expected.
Versioning strategy testing verifies that schema upgrades follow safe patterns. If the strategy is "deploy consumers first, then producers" (for BACKWARD compatibility), test this in staging: deploy consumer v2, verify it handles producer v1 messages, then deploy producer v2 and verify end-to-end.
Schema Evolution Best Practices
Treat schemas as code: Store schemas in Git, require code review for changes, version them alongside application code. Schema changes go through the same review process as code changes because their impact is equivalent.
Use BACKWARD or FULL compatibility by default: These modes prevent the most common breaking changes. Only use FORWARD if your deployment pattern requires it. Never use NONE in multi-team environments.
Add fields with defaults: When extending schemas, always provide default values. This maintains compatibility regardless of mode and makes evolution safe.
Never remove required fields directly: Make them optional first, deploy consumers that don't depend on them, then remove in a later version. This two-phase change prevents breaking consumers.
Deprecate before deleting: Mark fields as deprecated (via documentation or schema annotations), give consumers time to migrate away, then delete. Immediate deletion causes incidents.
Test compatibility in CI/CD: Automate schema validation. Manual review catches some issues but not all. Automated checks catch every incompatibility before merge.
Measuring Data Contract Health
Track three metrics: schema validation failure rate, compatibility break frequency, and schema-related incidents.
Schema validation failure rate measures how often producers attempt to publish messages that fail schema validation. High rates indicate producers aren't testing schemas before deployment. Target: under 1% failure rate.
Compatibility break frequency measures how often proposed schema changes fail compatibility checks in CI/CD. This should be rare; most changes should pass. Frequent failures mean teams don't understand the compatibility rules. Provide training and improve documentation.
Schema-related incidents measure production outages caused by schema incompatibility. This should be zero. Every schema incident represents a failure of enforcement: either validation wasn't enabled, the compatibility mode was too permissive, or the deployment process bypassed checks.
The Path Forward
Kafka data contracts prevent breaking changes through enforced compatibility modes, automated validation, and testing. Schemas aren't documentation—they're binding agreements that deployment pipelines enforce.
Conduktor validates schema compatibility before deployment, provides visibility into schema evolution across topics, and integrates with CI/CD to block incompatible changes. Teams ship schema changes confidently because validation catches breaks before production.
If schema changes cause production incidents, the problem isn't your schemas—it's the lack of enforcement.
Related: Schema Governance → · Kafka Data Quality → · Schema Registry Isn't Optional →