
Contributor

@anishshri-db anishshri-db commented Apr 3, 2024

What changes were proposed in this pull request?

Add a micro-benchmark for merge operations on multiple values in the value portion of the state store.

Why are the changes needed?

The micro-benchmark measures how merge operations perform with and without row tracking (`trackTotalNumberOfRows`).

As the results below show, merging without tracking is consistently about 3x faster:

merging 10000 rows with 10 values per key (10000 rows to overwrite - rate 100):  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------------------------------------------------------
RocksDB (trackTotalNumberOfRows: true)                                                    519            533           7          0.0       51916.6       1.0X
RocksDB (trackTotalNumberOfRows: false)                                                   171            177           3          0.1       17083.9       3.0X

GH Actions here:

The difference is even larger when running locally (more than 7x faster without tracking).
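
For anyone cross-checking the table, here is a minimal sketch of how its columns relate to each other. It assumes the rate, best time, and relative speedup are all derived from the per-row time of the best run, which is consistent with the numbers above. The helper names are made up for illustration and are not part of the benchmark code.

```python
# Per-row times (ns) copied from the results table; the row count comes from its header.
NUM_ROWS = 10_000
per_row_ns = {
    "trackTotalNumberOfRows: true": 51916.6,
    "trackTotalNumberOfRows: false": 17083.9,
}


def best_time_ms(ns_per_row, rows=NUM_ROWS):
    """Total time for all rows, converted from nanoseconds to milliseconds."""
    return ns_per_row * rows / 1e6


def rate_m_per_s(ns_per_row):
    """Rows per second in millions: (1e9 / ns_per_row) / 1e6."""
    return 1e3 / ns_per_row


# The first row (tracking enabled) is the 1.0X baseline.
baseline = per_row_ns["trackTotalNumberOfRows: true"]
for name, ns in per_row_ns.items():
    print(
        f"{name}: best ~{best_time_ms(ns):.0f} ms, "
        f"rate {rate_m_per_s(ns):.1f} M/s, "
        f"relative {baseline / ns:.1f}X"
    )
```

The rate column shows 0.0 and 0.1 only because it is rounded to one decimal place; the per-row time is the more useful column at this scale.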

Does this PR introduce any user-facing change?

No

How was this patch tested?

Test-only change.

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Apr 3, 2024
@anishshri-db anishshri-db changed the title from "[SPARK-47310] Add microbenchmark for merge operations for multiple values in value portion of state store" to "[SPARK-47310][SS] Add microbenchmark for merge operations for multiple values in value portion of state store" Apr 3, 2024
@anishshri-db anishshri-db marked this pull request as draft April 3, 2024 23:51
@anishshri-db anishshri-db marked this pull request as ready for review April 4, 2024 21:29
@anishshri-db
Contributor Author

@sahnib @HeartSaVioR - PTAL, thx!

@anishshri-db anishshri-db changed the title from "[SPARK-47310][SS] Add microbenchmark for merge operations for multiple values in value portion of state store" to "[SPARK-47310][SS] Add micro-benchmark for merge operations for multiple values in value portion of state store" Apr 4, 2024
Contributor

@sahnib sahnib left a comment

LGTM, thanks for adding this.

Contributor

@HeartSaVioR HeartSaVioR left a comment

+1

@HeartSaVioR
Contributor

Thanks! Merging to master.
