store/cache: fix broken metric and current index cache size handling #873
Merged
bwplotka merged 5 commits into thanos-io:master on Mar 4, 2019
Adding uint64(len(b)) to c.curSize can overflow uint64 if the numbers are big enough, in which case the sum wraps around and we might not remove enough items from the LRU to satisfy the request. Switching the operands avoids this problem: we check beforehand whether uint64(len(b)) is bigger than c.maxSize, so subtracting uint64(len(b)) from c.maxSize will *never* wrap around, because we know it is less than or equal to c.maxSize.
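The difference between the two checks can be illustrated with a small standalone sketch. The function names and the extreme values below are hypothetical and chosen only to force the wraparound; they are not taken from pkg/store/cache.go.

```go
package main

import (
	"fmt"
	"math"
)

// needsEvictionNaive mirrors the original form of the check.
// The addition can wrap around for very large operands.
func needsEvictionNaive(curSize, maxSize, itemSize uint64) bool {
	return curSize+itemSize > maxSize
}

// needsEvictionSafe mirrors the rewritten form of the check.
// It assumes the caller has already verified itemSize <= maxSize,
// so the subtraction cannot wrap around.
func needsEvictionSafe(curSize, maxSize, itemSize uint64) bool {
	return curSize > maxSize-itemSize
}

func main() {
	// Hypothetical extreme values: curSize+itemSize exceeds MaxUint64.
	var curSize uint64 = math.MaxUint64 - 5
	var maxSize uint64 = math.MaxUint64 - 1
	var itemSize uint64 = 10

	// The naive sum wraps to a small number, so no eviction is requested.
	fmt.Println("naive:", needsEvictionNaive(curSize, maxSize, itemSize))
	// The rewritten check correctly reports that eviction is needed.
	fmt.Println("safe: ", needsEvictionSafe(curSize, maxSize, itemSize))
}
```

For ordinary cache sizes both forms agree; they only diverge when the sum would exceed the uint64 range.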
adrien-f
reviewed
Feb 28, 2019
pkg/store/cache.go
Outdated
-	for c.curSize+uint64(len(b)) > c.maxSize {
-		c.lru.RemoveOldest()
+	for c.curSize > c.maxSize-uint64(len(b)) {
+		if _, val, ok := c.lru.RemoveOldest(); ok {
Member
Should we handle the case where ok is false?
Member
Author
How could we inform users about such a case? AFAICT, ok can never be false because of the check beforehand.
bwplotka
reviewed
Mar 1, 2019
Member
bwplotka
left a comment
Hm. Isn't curSize updated in onEvict?
Member
Author
Ah yes, that's true, but it is never increased. Will revert fbaae7c.
c.curSize is lowered in onEvict.
Add smoke tests for the index cache which check if we set curSize properly, and if removal works.
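The size accounting discussed above can be sketched roughly as follows. This is a simplified, hypothetical stand-in built on the standard library's container/list rather than the LRU package the real cache uses; the type and method names (sizedCache, set, removeOldest) are assumptions for illustration, and only curSize, maxSize, and onEvict echo names from the conversation.

```go
package main

import (
	"container/list"
	"fmt"
)

type entry struct {
	key string
	val []byte
}

// sizedCache is a hypothetical, simplified stand-in for the index cache.
type sizedCache struct {
	maxSize uint64
	curSize uint64
	order   *list.List // front = newest, back = oldest
	items   map[string]*list.Element
}

func newSizedCache(maxSize uint64) *sizedCache {
	return &sizedCache{maxSize: maxSize, order: list.New(), items: map[string]*list.Element{}}
}

// onEvict lowers curSize; in this sketch it is the only place curSize decreases.
func (c *sizedCache) onEvict(e entry) {
	c.curSize -= uint64(len(e.val))
}

// removeOldest drops the least recently added entry, if any.
func (c *sizedCache) removeOldest() bool {
	el := c.order.Back()
	if el == nil {
		return false
	}
	e := c.order.Remove(el).(entry)
	delete(c.items, e.key)
	c.onEvict(e)
	return true
}

// set adds an item and reports false if it can never fit.
func (c *sizedCache) set(key string, val []byte) bool {
	size := uint64(len(val))
	if size > c.maxSize {
		return false // item is larger than the whole cache
	}
	if _, ok := c.items[key]; ok {
		return true // sketch only: existing keys are left untouched
	}
	// size <= maxSize here, so the subtraction cannot wrap around.
	for c.curSize > c.maxSize-size {
		if !c.removeOldest() {
			break
		}
	}
	c.items[key] = c.order.PushFront(entry{key, val})
	c.curSize += size // without this increase the loop above never runs
	return true
}

func main() {
	c := newSizedCache(10)
	c.set("a", make([]byte, 6))
	c.set("b", make([]byte, 6)) // evicts "a" to make room
	fmt.Println("curSize:", c.curSize, "items:", len(c.items))
}
```

A smoke test in this spirit would add items, assert on curSize after each step, and confirm that the oldest entry is gone once the limit is exceeded.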
Member
There were 2 flaky test runs, but the build is green now: https://circleci.com/gh/improbable-eng/thanos/2353
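Potpourri of fixes to the cache:
- Do not forget to increase the c.current metric with the relevant labels when adding items to the cache. Without this, the metric thanos_store_index_cache_items is unavailable.
- Properly adjust c.curSize when adding items to the cache.
- Switch the order of operands in a check to avoid a potential uint64 overflow, which could lead to not removing enough items to satisfy our size constraints.
Without the last two fixes, the metric thanos_store_index_cache_items_evicted_total is always zero, and so is thanos_store_index_cache_items_overflowed_total (most of the time). This is because c.curSize was never increased, so we never entered the eviction loop.
Verification: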
thanos_store_index_cache_items_evicted_totalis always zero together withthanos_store_index_cache_items_overflowed_total(most of the time). This is because we never enter the loop, and we never increase c.curSize before this.Verification: