compact: add metric thanos_compactor_iterations_total #1733
Merged
bwplotka merged 5 commits into thanos-io:master on Nov 13, 2019
Conversation
Add a counter metric called thanos_compactor_iterations_total that is increased by 1 every time an iteration is executed successfully. This is needed in case --wait is specified and the Compactor then dies. We need to be able to alert on such a case.

One option would be to alert on a restart of the container; however, that is not very flexible: a restart might still be OK as long as the Compactor successfully finishes its job in time. At the moment, it is impossible to know whether it did.

Add this metric so that users can add alerts like:

```
rate(thanos_compactor_iterations_total[1d]) == 0 FOR 3d
```

Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
a18e0b3 to 8a9a90d
Let's register the metric unconditionally: if the Compactor is run as a batch job, this metric does not matter either way.

Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
FUSAKLA (Member) approved these changes on Nov 8, 2019 and left a comment:
Nice 👍
We should maybe update the alerts again at some point.
squat approved these changes on Nov 8, 2019
CHANGELOG.md (outdated)
- [#1573](https://github.com/thanos-io/thanos/pull/1573) `AliYun OSS` object storage, see [documents](docs/storage.md#aliyun-oss) for further information.
- [#1680](https://github.com/thanos-io/thanos/pull/1680) Add a new `--http-grace-period` CLI option to components which serve HTTP to set how long to wait until HTTP Server shuts down.
- [#1712](https://github.com/thanos-io/thanos/pull/1712) Rename flag on bucket web component from `--listen` to `--http-address` to match other components.
- [#1733](https://github.com/thanos-io/thanos/pull/1733) New metric `thanos_compactor_iterations_total` on Thanos Compactor which shows the number of successful iterations
Member
Tiny nit: add a period at the end here to keep them all full sentences.
Add a period at the end of an item in the CHANGELOG to keep it uniform.

Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
IKSIN pushed a commit to monitoring-tools/thanos that referenced this pull request on Nov 26, 2019
* compact: add metric thanos_compactor_iterations_total

  Add a metric called thanos_compactor_iterations_total that is a counter and will get increased by 1 every time an iteration gets executed successfully. This is needed in case --wait is specified and then our Compactor could die. We need to alert on such a case.

  One thing would be to alert on a restart of the container however that is not the most flexible thing - it might still be OK as long as it successfully finishes its job in time. However, it is impossible to know that exact part ATM.

  Add this metric so that users could add alerts like:

  ```
  rate(thanos_compactor_iterations_total[1d]) == 0 FOR 3d
  ```

  Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>

* CHANGELOG: add entry

  Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>

* compact: simplify wait check

  Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>

* cmd: thanos: compact: remove wait check

  Let's register the metric no matter what since if it is run as a batch job then this metric does not matter either way.

  Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>

* CHANGELOG: add period

  Add a period at the end of an item in the CHANGELOG to keep it uniform.

  Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>

Signed-off-by: Aleksey Sin <asin@ozon.ru>
There is thanos_objstore_bucket_last_successful_upload_time, but that still doesn't show whether we were able to get through the whole cycle successfully.