-
Notifications
You must be signed in to change notification settings - Fork 95
[FLINK-38199] Add SinkUpsertMaterializerBenchmark #106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
src/main/java/org/apache/flink/benchmark/SinkUpsertMaterializerBenchmark.java
Outdated
Show resolved
Hide resolved
| @Param("100") | ||
| public int payloadSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For micro benchmark purposes, like optimising code, guarding against regressions, the lower the payload, the better, as you get more contrasting effects of any performance related changes - optimisations/changes to the related code will be more clearly visible instead of being watered down by bytes copy.
So I would shrink it down, maybe to 10 or 20.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think both cases can be problematic; I'll change to a smaller value and think about keeping 100 (or increasing it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to
@Param({"10", "250"})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm.. it now takes 2+ hours for the whole benchmark to run locally.
I'm going to change to a single value of 20.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we reduce the number of records per invocation? Or pick different number of records depending on the benchmark case? Slower benchmarks probably don't need as much records per invocation as faster cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It takes long due to the number of combinations + benchmark setup, not due to the number of records:
- Warmup: 10 iterations, 10 s each
- Measurement: 10 iterations, 10 s each
- forks: 3
So it's 3 * 2 * 10 * 10 = 10m for one combination.
And here are benchmark parameters:
- upsert key: true / false
- state backend: rocksdb / heap
- retract delay: 1 / 1000
- payload size, retract percentage - fixed, so don't increase
meaning there 2^3=8 combinations in total.
So the total time is 8*10=80 minutes.
I don't see what combinations can we exclude, all seem important.
OTH, I think benchmark setup is probably redundant for this particular benchmark. I haven't notice significant difference when running with lower duration/forkCount/measurementCount. So I'd rather reduce those
For example to
- Warmup: 1 iterations, 5 s each
- Measurement: 5 iterations, 5 s each
- forks: 2
So it would be 2 * (5 + 5*5) * 8 = 8 minutes.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't touch the fork counts.
The recommended target for a single invocation is < 1s. Within a single iteration JMH works by invoking the benchmark method repeatedly for ~1s. If single invocation takes more than 1s, than iteration ends after the single invocation, which is usually not desired needed.
Given that here we are not spawning whole Flink job, on a mini cluster, I'm pretty sure we could reduce the invocation duration.
I guess maybe part of the problem is that your single invocation is also responsible for building up the state size, which arguably should/could be moved out to a setup method? You could:
- build up the desired state size/length of the values list in the setup method
- in invocation just run single process record call, or maybe single insertion and single retraction, after which the state size/list length remains unchanged?
That should be not only fine (as there would be no setup code in the benchmarked method), but that would be even by the book the intended way how to use JMH.
Each iteration would then take exactly 1s, so 1 minute per parameters combination, 8 minutes in total.
src/main/java/org/apache/flink/benchmark/SinkUpsertMaterializerBenchmark.java
Outdated
Show resolved
Hide resolved
pnowojski
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly LGTM % opened comments
src/main/java/org/apache/flink/benchmark/SinkUpsertMaterializerBenchmark.java
Outdated
Show resolved
Hide resolved
|
@flinkbot run azure |
Requires apache/flink#26876