How to send Large Messages in Apache Kafka?
Kafka's default maximum message size is 1MB. In this lesson, we will look at two approaches for handling larger messages in Kafka.
Kafka limits messages to 1MB per message in a topic by default, because very large messages are considered inefficient and an anti-pattern in Apache Kafka.
Yet, you may still need to send large messages through Apache Kafka.
There are two approaches to sending large messages in Apache Kafka:
Modify your client to store large messages (for example, video files) outside of Kafka and send only a reference to these messages through Kafka. This involves extra logic on your end but could prove quite efficient.
The store for your large messages could be a cloud store such as Amazon S3, or an on-premises large file storage system such as a network file system or HDFS.
To date, no library performs this functionality out of the box, but it shouldn't be too complicated to engineer: you will need to write both a custom producer and a custom consumer.
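As a minimal sketch of this reference-based (claim-check) pattern, the producer side uploads the payload to an external store and sends only a small reference string through Kafka. All names below are hypothetical, and an in-memory map stands in for the real object store; the actual Kafka `producer.send(...)` call is shown as a comment since it needs the kafka-clients dependency and a running broker:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Claim-check sketch: store the large payload externally, send only a reference.
public class LargeMessageReference {

    // Stand-in for an external object store such as Amazon S3 or HDFS.
    static final Map<String, byte[]> objectStore = new HashMap<>();

    // Upload the large payload and return a small reference to send through Kafka.
    static String storeLargePayload(byte[] payload) {
        String key = "s3://large-messages/" + UUID.randomUUID();
        objectStore.put(key, payload);
        return key; // this short string is what the Kafka producer actually sends
    }

    // Consumer side: resolve the reference back into the payload.
    static byte[] fetchLargePayload(String reference) {
        return objectStore.get(reference);
    }

    public static void main(String[] args) {
        byte[] video = new byte[50 * 1024 * 1024]; // 50MB, far over Kafka's default limit
        String ref = storeLargePayload(video);
        // producer.send(new ProducerRecord<>("large-message", ref)); // requires a broker
        byte[] roundTrip = fetchLargePayload(ref);
        System.out.println(roundTrip.length == video.length);
    }
}
```

The consumer then receives the reference, fetches the payload from the store, and processes it; cleanup of stored objects (e.g., via S3 lifecycle rules) is left to your application.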
The second approach is to modify the topic, producer, and consumer configurations to allow for a bigger message size.
It is recommended to leave the default max message size on the Kafka brokers and override it only where needed through topic-level configurations. The broker-side setting is message.max.bytes and the topic-level setting is max.message.bytes.
Let's create a topic named large-message:
kafka-topics.sh --bootstrap-server localhost:9092 --create --topic large-message --partitions 3 --replication-factor 1
And add the necessary max.message.bytes configuration for 10MB:
kafka-configs.sh --bootstrap-server localhost:9092 \
--alter --entity-type topics \
--entity-name large-message \
--add-config max.message.bytes=10485880
Now our topic is created and configured to receive large messages, but this is not enough.
You must also set replica.fetch.max.bytes=10485880
so that your brokers can replicate the large messages correctly. This setting can only be changed in the broker config file server.properties
and requires a broker restart.
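For reference, the corresponding entry in each broker's server.properties would look like this (the value matches the 10MB limit used throughout this lesson):

```properties
# server.properties — requires a broker restart to take effect
replica.fetch.max.bytes=10485880
```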
You must also change the max.partition.fetch.bytes
configuration on your consumer clients. If this value is smaller than the broker's message.max.bytes
(or the topic's max.message.bytes), the consumer will fail to fetch these messages and will get stuck on processing, which is very undesirable.
To set this in your CLI, you can use --consumer-property:
kafka-console-consumer.sh --bootstrap-server localhost:9092 \
--topic large-message \
--from-beginning \
--consumer-property max.partition.fetch.bytes=10485880
Or in your Java code:
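A minimal sketch of the consumer configuration follows. The configuration keys are the real Kafka client property names; the bootstrap server, group id, and deserializers are example values, and the KafkaConsumer construction is shown as a comment because it requires the kafka-clients dependency and a running broker:

```java
import java.util.Properties;

// Consumer configuration allowing fetches of messages up to ~10MB per partition.
public class LargeMessageConsumerConfig {

    static Properties buildConsumerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "large-message-group"); // example group id
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // Must be at least as large as the topic's max.message.bytes (10485880 here)
        props.put("max.partition.fetch.bytes", "10485880");
        return props;
    }

    public static void main(String[] args) {
        Properties props = buildConsumerProps();
        // KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        // consumer.subscribe(java.util.List.of("large-message"));
        System.out.println(props.getProperty("max.partition.fetch.bytes"));
    }
}
```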
You must also change the max.request.size
property on the producer side to ensure large messages can be sent.
To set this in your CLI, you can use --producer-property:
kafka-console-producer.sh --bootstrap-server localhost:9092 \
--topic large-message \
--producer-property max.request.size=10485880
Or in your Java code:
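A matching sketch for the producer configuration follows. Again, the property keys are the real Kafka client configuration names, the server and serializers are example values, and the KafkaProducer calls are shown as comments since they require the kafka-clients dependency and a running broker:

```java
import java.util.Properties;

// Producer configuration allowing a single request to carry a message up to ~10MB.
public class LargeMessageProducerConfig {

    static Properties buildProducerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Match the 10MB limit configured on the topic via max.message.bytes
        props.put("max.request.size", "10485880");
        return props;
    }

    public static void main(String[] args) {
        Properties props = buildProducerProps();
        // KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        // producer.send(new ProducerRecord<>("large-message", largePayload));
        System.out.println(props.getProperty("max.request.size"));
    }
}
```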
Conduktor & Advanced Topic Configuration
You can use workarounds for large messages, but why not just solve the problem completely? Conduktor Platform's cold storage capabilities mean you don't need to be limited in the size of your messages or the time you retain them! Try it now!