store/proxy: Deduplicate chunks on StoreAPI level. Recommend chunk sorting for StoreAPI.#2603
Conversation
973ad6a to
8717698
Compare
715a3ed to
4f3f53d
Compare
|
As expected there is some overhead of this. 3x CPU 1.5x Mem allocs. Overhead was expected but maybe not that significant. NOTE: Dataset does not have ANY duplicates, so we do in this function MORE, to do later, less (: |
|
I think the strategy seems sane, but could we add a couple of benchmarks that prove the theory that this does indeed make the typical setup faster? |
…ng for StoreAPI. Also: Merge same series together on proxy level instead select. This allows better dedup efficiency. Partially fixes: #2303 Cases like overlapped data from store and sidecar and 1:1 duplicates are optimized as soon as it's possible. This case was highly visible on GitLab repro data and exists in most of Thanos setup. Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
4f3f53d to
da3c83e
Compare
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
da3c83e to
f6004ab
Compare
|
Ok I optimized a lot and I think it's good to go. I can think of more optimizations, but it looks good. NOTE: I just noticed this is the full overlapping benchmark. So it means 100% chunks in those "blocks" are overlapping. That's why comparison was so crucial (: This is normally a rare operation though. I extended tests with non overlap as well. Also this version is with |
34d9706 to
0ebf5e1
Compare
|
Ok last version uses LESS resources and is FASTER and adds MORE features =D This is compared to the code WITHOUT this PR. Considering this done! |
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
0ebf5e1 to
b6579bb
Compare
|
We got hit by my chaining practice again - it was not merged to master after all ): We need some bot helping or I can try modifying titles or something |


Blocked by: #2548
Partially fixes: #2303
It should make certain calls where we have overlapped data from StoreAPIs much faster and more efficient.
Changes:
select. This allows for better dedup efficiency.Signed-off-by: Bartlomiej Plotka bwplotka@gmail.com