Kafka: Scaling producers and consumers
Going Deeper
Once you get further into producer tuning, the configurations become more interrelated, with important non-linear and sometimes unexpected effects on performance. It pays to be patient and scientific when testing combinations of parameters. Remember to keep returning to the root bottleneck while optimizing the rate of records flowing through the producer.
The next questions to ask are: How big are your records (as Kafka sees them, not as you think they are)? Are you making "good" batches?
The size of the batch is determined by the batch.size configuration, which is the number of bytes after which the producer will send the request to the brokers, regardless of the linger.ms value. Requests sent to brokers will contain multiple batches, one for each partition.
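To make this concrete, here is a minimal Java sketch of a producer that sets both knobs explicitly. The broker address, topic name, and the 64KB/20ms/lz4 values are placeholders to benchmark against your own workload, not recommendations:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class TunedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 64 * 1024); // close a batch at 64KB ...
        props.put(ProducerConfig.LINGER_MS_CONFIG, 20);         // ... or after 20ms, whichever comes first
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4"); // test on your own data

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-topic", "key", "value")); // hypothetical topic
        }
    }
}
```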
A few other things you need to check include the number of records per batch and their size. Here is where you can start really digging into the kafka.producer MBean. The batch-size-[avg|max] metrics can give you a good idea of the distribution of the number of bytes per batch, and record-size-[avg|max] can give you a sense of the size of each record. Divide the two and you have a rough rate of records per batch. Match this to the batch.size configuration and determine approximately how many records should be flowing through your producer. You should also sanity check this against the record-send-rate (the number of records per second) reported by your producer.
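If you would rather pull these numbers programmatically than from an MBean browser, the client exposes the same values through its metrics() method. A minimal sketch, assuming a running producer instance; the BatchStats class name is made up for illustration:

```java
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.common.MetricName;

class BatchStats {
    // Estimates records per batch from the producer's own metrics registry.
    static void printBatchStats(Producer<?, ?> producer) {
        double batchAvg = 0, recordAvg = 0, sendRate = 0;
        for (var e : producer.metrics().entrySet()) {
            MetricName n = e.getKey();
            if (!"producer-metrics".equals(n.group())) continue;
            Object v = e.getValue().metricValue();
            if (!(v instanceof Double)) continue;
            double d = (Double) v;
            switch (n.name()) {
                case "batch-size-avg":   batchAvg  = d; break;
                case "record-size-avg":  recordAvg = d; break;
                case "record-send-rate": sendRate  = d; break;
            }
        }
        System.out.printf("bytes/batch=%.0f bytes/record=%.0f records/s=%.1f%n",
                batchAvg, recordAvg, sendRate);
        if (recordAvg > 0) // rough records-per-batch estimate
            System.out.printf("records/batch (approx.)=%.1f%n", batchAvg / recordAvg);
    }
}
```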
You might be surprised if you occasionally have very large messages, so check record-size-max: the max.request.size configuration limits the maximum size (in bytes) of a request and therefore inherently limits the number of record batches as well.
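One practical consequence: a record that exceeds max.request.size is rejected by the client, and the send callback receives a RecordTooLargeException. A small sketch of catching it, assuming an existing producer; OversizeGuard and sendChecked are hypothetical names:

```java
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.RecordTooLargeException;

class OversizeGuard {
    // Sends a record and reports when it exceeds the size limits.
    static void sendChecked(Producer<String, String> producer, String topic, String value) {
        producer.send(new ProducerRecord<>(topic, value), (metadata, exception) -> {
            if (exception instanceof RecordTooLargeException) {
                // The record exceeded max.request.size (or the broker-side limit):
                // split it, compress it, or route it elsewhere.
                System.err.println("Dropped oversized record: " + exception.getMessage());
            }
        });
    }
}
```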
What about the time you are waiting for I/O? Check out the io-wait-ratio metric to see where you are spending time. Is the I/O thread waiting, or are your producers processing?
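The same io-wait-ratio is exposed as a JMX attribute on the kafka.producer MBean. A minimal sketch that reads it, assuming the code runs in the same JVM as the producer (otherwise attach via a remote JMX connector) and that the producer's client.id is my-producer, which is a placeholder:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class IoWaitCheck {
    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName(
                "kafka.producer:type=producer-metrics,client-id=my-producer");
        double ioWait = (Double) server.getAttribute(name, "io-wait-ratio");
        // Close to 1.0: the I/O thread mostly waits for data to send (client-side bottleneck);
        // close to 0.0: the network or the brokers are the constraint.
        System.out.printf("io-wait-ratio: %.2f%n", ioWait);
    }
}
```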
Next, you need to make sure that the client buffer is not getting filled. Each producer has a fairly large buffer to collect data that then gets batched and sent to the brokers. In practice, I have never seen this be a problem, but it often pays to be methodical. Here, the buffer-available-bytes metric is your friend, allowing you to ensure that your buffer.memory size is not being exhausted by your record sizes, your batching configurations from earlier, or both.
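If you do hit the limit, two settings work together: buffer.memory caps the buffer, and max.block.ms bounds how long send() blocks on a full buffer before throwing a TimeoutException. An illustrative fragment, extending the Properties from the earlier sketch; the values are arbitrary:

```java
// Raise the 32MB default buffer and fail faster than the 60s default block.
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 64L * 1024 * 1024); // 64MB send buffer
props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 5_000); // send() throws TimeoutException after 5s if still full
```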
Producing to too many different topics can affect the quality of compression, because you can't compress well across topics. In that case, you might need application changes to batch more aggressively per destination topic, rather than relying on Kafka to just do the right thing. An advanced tactic is to check the per-topic byte rates from the producer, but only consider doing so after benchmarking shows that other adjustments are not helping.
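The client already records these per-topic rates under the producer-topic-metrics group, each tagged with its topic. A short sketch that dumps them from a running producer; the class and method names are made up:

```java
import org.apache.kafka.clients.producer.Producer;

class TopicStats {
    // Prints per-topic rates (byte-rate, record-send-rate, ...) from the metrics registry.
    static void printTopicRates(Producer<?, ?> producer) {
        producer.metrics().forEach((name, metric) -> {
            if ("producer-topic-metrics".equals(name.group()))
                System.out.printf("%s [topic=%s] = %s%n",
                        name.name(), name.tags().get("topic"), metric.metricValue());
        });
    }
}
```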
Wrap-Up
The configurations and metrics to tweak on the producer for high throughput are summarized in Table 1. At this point, you should have all the tools you need to scale up your client instances. You know the most important optimization switches, some guidelines for adjusting garbage collection, and the round-robin trick for balancing consumer groups when the consumers encounter differently partitioned topics. For slow producers, apply the standard optimizations for compression and idle time, or dive into the depths of the producer configuration to find out what really happens to records and batches.
Table 1: Producer Tuning Summary

| Config/Metric | Comment |
|---|---|
| compression.type | Test on your data. |
| linger.ms | Check the average time a batch waits in the send buffer (how long it takes to fill a batch) with record-queue-time-avg. |
| batch.size | Determine records per batch, bytes per batch (batch-size-avg, batch-size-max), and records per topic per second (record-send-rate), and check your bytes per topic. |
| max.request.size | Limits the number and size of batches (record-size-max). |
| Time spent waiting for I/O | Are you really waiting (io-wait-ratio)? |
| buffer.memory + queued requests | 32MB default (roughly the total memory used by the producer) allocated to buffer records for sending (see buffer-available-bytes). |
The tips in this article should give you guidance beyond the raw documentation in the Kafka manual for removing bottlenecks and getting the performance you know you should be getting out of all parts of your streaming data pipelines.
Infos
- Apache Kafka: https://kafka.apache.org
- Producer performance tuning for Apache Kafka: https://www.slideshare.net/JiangjieQin/producer-performance-tuning-for-apache-kafka-63147600