Skip to content

Conversation

@punit-kulal
Copy link

@punit-kulal punit-kulal commented Mar 21, 2023

linger.ms defines the maximum interval for buffering messages in the client before being produced to kafka. This improves efficiency of the producer and also greatly reduces the load on kafka.

The interval and buffering also ensures better compression when enabled.

More Reading: https://github.com/confluentinc/librdkafka/blob/master/INTRODUCTION.md#performance

Open Questions:

  • Other configuration related to kafka sink are similar to flags, would you want this configuration be also similar to switch where the recommended config(1000) is behind a flag and in the code or should it be similar to the min_commit interval config we have for firehose.

Todo:

  • Update version

Copy link
Member

@lavkesh lavkesh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@punit-kulal punit-kulal changed the title WIP: Add support for kafka producer config linger.ms in kafka sink Add support for kafka producer config linger.ms in kafka sink Mar 28, 2023
Description: Add three constants to support linger ms. The default value set is 5 which is the default for the kafka producer client.
Description: Integrate configuration support for linger ms in the kafka producer initialisation.
Description: The default of 5 ms is for librd kafka clients. Java client has default as 0. Hence, updating it.
Description: Import reference now use the static constant class.
@punit-kulal punit-kulal force-pushed the kafka-sink-optimisation branch from 09c259f to e1b0079 Compare March 28, 2023 04:30
sumitaich1998

This comment was marked as outdated.

@sumitaich1998 sumitaich1998 dismissed their stale review April 20, 2023 10:14

some changes required


#### `SINK_KAFKA_LINGER_MS`

Defines the max interval the sink should we wait for the sink/producer buffer to fill

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just add "in milliseconds" after "max interval"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

Copy link

@sumitaich1998 sumitaich1998 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, shouldn't we also add the batch.size config for the kafka producer along with linger.ms?

@punit-kulal
Copy link
Author

punit-kulal commented Apr 21, 2023

also, shouldn't we also add the batch.size config for the kafka producer along with linger.ms?

The default batch.size works well for most cases. Reason for custom batch.size config would be ?

@sumitaich1998
Copy link

also, shouldn't we also add the batch.size config for the kafka producer along with linger.ms?

The default batch.size works well for most cases. Reason for custom batch.size config would be ?

just so that we can tune it later for a particular dagger if we want to, we can just keep the default value in odin, we wont expose it in UI

@punit-kulal
Copy link
Author

also, shouldn't we also add the batch.size config for the kafka producer along with linger.ms?

The default batch.size works well for most cases. Reason for custom batch.size config would be ?

just so that we can tune it later for a particular dagger if we want to, we can just keep the default value in odin, we wont expose it in UI

That can be out of the scope of the PR and can be picked independently. The scope of this PR is to expose this certain config and this helps with efficient produce of messages. There are a multiple other producer configs present as well which are currently not exposed. And can be optionally exposed in a future PR. WDYT ?

@sumitaich1998
Copy link

also, shouldn't we also add the batch.size config for the kafka producer along with linger.ms?

The default batch.size works well for most cases. Reason for custom batch.size config would be ?

just so that we can tune it later for a particular dagger if we want to, we can just keep the default value in odin, we wont expose it in UI

That can be out of the scope of the PR and can be picked independently. The scope of this PR is to expose this certain config and this helps with efficient produce of messages. There are a multiple other producer configs present as well which are currently not exposed. And can be optionally exposed in a future PR. WDYT ?

yeah ok should be fine

Copy link

@sumitaich1998 sumitaich1998 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@sumitaich1998 sumitaich1998 merged commit c53192e into goto:main May 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants