store the first raw value of a chunk during downsampling #1709
bwplotka merged 5 commits into thanos-io:master
As discussed in thanos-io#1568, storing only the last raw value of a chunk will lose a counter reset when:
a) the reset occurs at a chunk boundary, and
b) the last raw value of the earlier chunk is less than the first aggregated value of the later chunk.

This commit stores the first raw value of a chunk during the initial raw aggregation, and retains it during subsequent aggregations. This is similar to the existing handling for the last raw value of a chunk.

With this change, when counterSeriesIterator iterates over a chunk boundary, it will see the last raw value of the earlier chunk, then the first raw value of the later chunk, and then the first aggregated value of the later chunk. The first raw value will always be less than or equal to the first aggregated value, so the only difference in counterSeriesIterator's output will be the possible detection of a reset and an extra sample after the chunk boundary.

Fixes: thanos-io#1568
Signed-off-by: Alfred Landrum <alfred@leakybucket.org>
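To make the boundary case in the description concrete, here is a minimal Go sketch. It is not the actual counterSeriesIterator: `accumulate` is a hypothetical helper that applies only the usual counter rule (a value lower than the previous one is treated as a reset), and the sample values are invented for illustration. It assumes the earlier chunk ends with a last raw value of 30, the counter then resets at the boundary, and the later chunk has a first raw value of 5 and a first aggregated value of 50.

```go
package main

import "fmt"

// accumulate is a hypothetical, simplified stand-in for reset-aware counter
// iteration. It returns the cumulative counter value over the samples,
// treating any decrease as a reset (the counter restarted from zero).
func accumulate(samples []float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	total, last := samples[0], samples[0]
	for _, v := range samples[1:] {
		if v >= last {
			// No reset visible: add the plain increase.
			total += v - last
		} else {
			// Value dropped: assume a reset, so the whole value is new increase.
			total += v
		}
		last = v
	}
	return total
}

func main() {
	// Only the last raw value of the earlier chunk (30) is stored, followed by
	// the first aggregated value of the later chunk (50). Since 30 < 50, the
	// reset at the boundary is invisible.
	withoutFirstRaw := []float64{30, 50}

	// The first raw value of the later chunk (5) is stored as well. The drop
	// from 30 to 5 reveals the reset.
	withFirstRaw := []float64{30, 5, 50}

	fmt.Println(accumulate(withoutFirstRaw)) // 50
	fmt.Println(accumulate(withFirstRaw))    // 80
}
```

Under these assumptions the first sequence undercounts by exactly the last raw value of the earlier chunk, which is the loss that conditions a) and b) describe.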
bwplotka left a comment
Nice! I don't have a way to really e2e test this case, but I read through the algorithm, and it makes sense to me 👍 Thanks! And thanks for the awesome explanations on both the issue and the PR!
Just curious, did you also observe this in your real system, in an actual query? (: If yes, does Thanos with this PR return the expected result?
Small style nit only from my side.
@brian-brazil could you take a look as well? (:
pkg/compact/downsample/downsample.go (outdated)

    }

    func (b *aggrChunkBuilder) finalizeChunk(lastT int64, trueSample float64) {
    func (b *aggrChunkBuilder) firstRawSample(firstT int64, trueSample float64) {
Those functions are quite shallow, and really the same. Can we maybe just inline them with a comment? We use them twice, sure, but if we inlined them it might be even clearer?
Maybe, but IMHO this split is clear too, since the function's name literally tells you what's happening 😄 Up to you.
I've handled this in the latest diff by inlining the actions of the functions and pointing them to new explanatory comments at CounterSeriesIterator. Please take a look.
@bwplotka: regarding your question, I didn't observe this directly. My colleague @aponjavic spotted the potential issue as we were studying how Thanos implements downsampling, so I don't have an easy setup that reproduces the issue and the fix.
GiedriusS left a comment
Maybe we should also update the comment around the type CounterSeriesIterator?
bwplotka left a comment
LGTM, thanks!
BTW, great talk at PromCon (: We might even want to link the slides in the downsampling doc.