pkg/gate: Prefix gate metrics for selects#3154
Conversation
722181c to
9b9d35e
Compare
kakkoyun
left a comment
There was a problem hiding this comment.
Thanks for the contributions. Looking good.
I have a couple of comments.
cmd/thanos/query.go
Outdated
| engine = promql.NewEngine( | ||
| queryableCreator = query.NewQueryableCreator( | ||
| logger, | ||
| extprom.WrapRegistererWithPrefix("thanos_query_concurrent_selects_", reg), |
There was a problem hiding this comment.
Right now, this registry only used for concurrent selects. However, it could be used for others as well. I think we should wrap the registry this early.
There was a problem hiding this comment.
Good point. I'll think about it.
fff2500 to
430b3ed
Compare
Thanos query exposed `gate_queries_in_flight` and `gate_duration_seconds_bucket` metrics for concurrent selects. These metrics are now prefixed by `thanos_query_concurrent_selects_`. Signed-off-by: Simon Pasquier <spasquie@redhat.com>
This change deprecates the gate.(*Keeper) struct. When Keeper is used to create several gates, the metric tracking the number of in-flight metric isn't meaningful because it is hard to say whether requests are being blocked or not. As such the `thanos_query_concurrent_selects_gate_queries_in_flight` is removed. The following metrics have been added to record the maximum number of concurrent requests per gate: * `thanos_query_gate_queries_max` * `thanos_bucket_store_series_gate_queries_max`, previously known as `thanos_bucket_store_queries_concurrent_max.` * `thanos_memcached_getmulti_gate_queries_max` Signed-off-by: Simon Pasquier <spasquie@redhat.com>
430b3ed to
0a6b36a
Compare
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
| // instead. Otherwise it is recommended to use the | ||
| // github.com/prometheus/prometheus/pkg/gate package directly and wrap the | ||
| // returned gate with gate.InstrumentGateDuration(). | ||
| type Keeper struct { |
There was a problem hiding this comment.
Is this still used anywhere in the codebase? If it's not the case, why not remove it directly? I don't think we have any downstream dependents.
There was a problem hiding this comment.
unfortunately we have, I should have pointed out in the PR.
https://github.com/cortexproject/cortex/blob/f1b95c68d3af4f715b11bd33c53ebaf83260f3bb/pkg/storegateway/bucket_stores.go#L72
There was a problem hiding this comment.
and given that Thanos imports Cortex too, it creates a circular dependency hence the need to keep Keeper :)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Changes
The metrics on the gates for concurrent selects were exposed without a prefix (e.g.
gate_queries_in_flightandgate_duration_seconds) unlike the gate metrics for the query API. The first commit 60950ff adds thethanos_query_concurrent_selects_prefix.After more thinking, I felt that exposing the number of in-flight queries doesn't make much sense for concurrent selects because each query creates a new gate. Thus we can have (max number of concurrent selects per query x max number of concurrent queries) selects in flight and it doesn't tell whether a query is blocked on too many selects or not. I've added another commit
722181c that removes the
thanos_query_concurrent_selects_gate_queries_in_flightmetric.The following metrics have also been added to record the maximum number of concurrent requests per gate:
thanos_query_gate_queries_maxthanos_bucket_store_series_gate_queries_max, previously known asthanos_bucket_store_queries_concurrent_max.thanos_memcached_getmulti_gate_queries_maxVerification
Tested manually.