*: support TLS and authentication for Thanos Ruler queries #1939
bwplotka merged 4 commits into thanos-io:master
bwplotka
left a comment
Nice!
Some comments, but generally 👍 👍 👍
Thanks!
cmd/thanos/rule.go
Outdated
| return err | ||
| } | ||
| c.Transport = tracing.HTTPTripperware(logger, c.Transport) | ||
| queryClient, err := http_util.NewFanoutClient(logger, cfg.EndpointsConfig, c, queryProvider.Clone()) |
Fanout? I think the logic for rule evaluation might be different. Fanout implies a broadcast approach.
There was a problem hiding this comment.
Yeah, the logic is OK, it's just that the name of the client might be confusing. It's not fanout for the Querier.
cmd/thanos/rule.go
Outdated
| } | ||
| // Each Alertmanager client has a different list of targets thus each needs its own DNS provider. | ||
| am, err := alert.NewAlertmanager(logger, cfg, amProvider.Clone()) | ||
| amClient, err := http_util.NewFanoutClient(logger, cfg.EndpointsConfig, c, amProvider.Clone()) |
We can add the tracing Tripperware here as well, I think.
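For illustration, a minimal sketch of what a tripperware looks like in plain Go: a function that wraps an `http.RoundTripper` and returns another one. The names `headerTripperware`, `newWrappedClient`, `echoTraceHeader` and the `X-Example-Trace` header are hypothetical and only stand in for the tracing wrapper discussed here; this is not the `tracing.HTTPTripperware` implementation.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// roundTripperFunc adapts a plain function to the http.RoundTripper interface.
type roundTripperFunc func(*http.Request) (*http.Response, error)

func (f roundTripperFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

// headerTripperware is a hypothetical stand-in for a tracing tripperware:
// it wraps next and tags every outgoing request with a header.
func headerTripperware(next http.RoundTripper, key, value string) http.RoundTripper {
	return roundTripperFunc(func(r *http.Request) (*http.Response, error) {
		// A RoundTripper must not modify the caller's request, so clone it first.
		r2 := r.Clone(r.Context())
		r2.Header.Set(key, value)
		return next.RoundTrip(r2)
	})
}

// newWrappedClient builds a client whose transport is wrapped by the tripperware.
func newWrappedClient(base http.RoundTripper) *http.Client {
	if base == nil {
		base = http.DefaultTransport
	}
	return &http.Client{Transport: headerTripperware(base, "X-Example-Trace", "rule")}
}

// echoTraceHeader starts a throwaway test server that echoes the header back,
// sends one request through the wrapped client, and returns the response body.
func echoTraceHeader() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.Header.Get("X-Example-Trace"))
	}))
	defer srv.Close()

	resp, err := newWrappedClient(nil).Get(srv.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	got, err := echoTraceHeader()
	fmt.Println(got, err)
}
```

Because the wrapper only depends on `http.RoundTripper`, the same function can be applied to the query client and the Alertmanager client alike.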
cmd/thanos/rule.go
Outdated
| // Discover and resolve query addresses. | ||
| { | ||
| for _, c := range queryClients { | ||
| addToGroup(g, c, dnsSDInterval) |
This is not consistent with the Alertmanager addToGroup placement. Let's decide on one approach if we can (:
Reason: I spent some time trying to understand whether we missed addToGroup for query or not (:
There was a problem hiding this comment.
I've moved the call to addToGroup to the loop creating the query clients.
cmd/thanos/rule.go
Outdated
| queryConfig := extflag.RegisterPathOrContent(cmd, "query.config", "YAML file that contains query API servers configuration. See format details: https://thanos.io/components/rule.md/#configuration. If defined, it takes precedence over the '--query' and '--query.sd-files' flags.", false) | ||
| fileSDFiles := cmd.Flag("query.sd-files", "Path to file that contain addresses of query peers. The path can be a glob pattern (repeatable)."). |
- fileSDFiles := cmd.Flag("query.sd-files", "Path to file that contain addresses of query peers. The path can be a glob pattern (repeatable).").
+ fileSDFiles := cmd.Flag("query.sd-files", "Path to file that contains addresses of query API servers. The path can be a glob pattern (repeatable).").
| // TODO(bwplotka): Propagate those to UI, probably requires changing rule manager code ): | ||
| level.Warn(logger).Log("warnings", strings.Join(warns, ", "), "query", q) | ||
| if err != nil { | ||
| level.Error(logger).Log("err", err, "query", q) |
Let's use continue instead of else here, it's a bit more readable.
cmd/thanos/rule.go
Outdated
| return v, nil | ||
| } | ||
| } | ||
| return nil, errors.Errorf("no query peer reachable") |
- return nil, errors.Errorf("no query peer reachable")
+ return nil, errors.Errorf("no query API server reachable")
The "peer" name is obsolete: it comes from the gossip times. (:
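To make the two remarks on this function concrete (use continue on error, and return an error when nothing answered), here is a minimal standalone sketch. `queryAny`, `errDown` and the `query` callback are hypothetical names that stand in for the real HTTP call; only the random-order loop via `rand.Perm` and the final error message come from the diff.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// errDown is a hypothetical error used by the demo to simulate a dead endpoint.
var errDown = errors.New("endpoint down")

// queryAny tries the endpoints in random order and returns the first
// successful result. The query callback stands in for the real HTTP request.
func queryAny(endpoints []string, query func(endpoint string) (string, error)) (string, error) {
	for _, i := range rand.Perm(len(endpoints)) {
		v, err := query(endpoints[i])
		if err != nil {
			// Report the failure and move on to the next endpoint.
			// Using continue keeps the success path unindented (no else branch).
			fmt.Printf("query failed: endpoint=%s err=%v\n", endpoints[i], err)
			continue
		}
		return v, nil
	}
	return "", errors.New("no query API server reachable")
}

func main() {
	v, err := queryAny([]string{"a", "b"}, func(e string) (string, error) {
		if e == "a" {
			return "", errDown
		}
		return "ok from " + e, nil
	})
	fmt.Println(v, err)
}
```

The order in which endpoints are tried is random, so the number of logged failures before a success can vary between runs.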
cmd/thanos/rule.go
Outdated
| } | ||
| } | ||
| func addToGroup(g *run.Group, c *http_util.FanoutClient, interval time.Duration) { |
It's not just an add: it adds groups for both DNS and file discovery, so maybe:
- func addToGroup(g *run.Group, c *http_util.FanoutClient, interval time.Duration) {
+ func addDiscoveryGroups(g *run.Group, c *http_util.FanoutClient, interval time.Duration) {
| @@ -738,36 +719,49 @@ func queryFunc( | |||
| } | |||
| return func(ctx context.Context, q string, t time.Time) (promql.Vector, error) { | |||
I hope we can set a proper timeout in the caller on top of this (:
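One way a caller could bound each evaluation is to derive a context with a deadline before invoking the query function. The sketch below is an assumption about how that might look, not what the ruler does: `queryFunc` is simplified to return a string instead of `promql.Vector`, and `withTimeout`, `slowQuery`, `runDemo` and `timedOut` are hypothetical helpers.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// queryFunc mirrors the shape of the function in the diff, simplified to
// return a string so the example stays self-contained.
type queryFunc func(ctx context.Context, q string, t time.Time) (string, error)

// withTimeout wraps f so that every call runs under its own deadline.
func withTimeout(f queryFunc, d time.Duration) queryFunc {
	return func(ctx context.Context, q string, t time.Time) (string, error) {
		ctx, cancel := context.WithTimeout(ctx, d)
		defer cancel() // Release the timer as soon as the call returns.
		return f(ctx, q, t)
	}
}

// slowQuery simulates a backend that needs delay to answer and that
// stops early when the context is cancelled.
func slowQuery(delay time.Duration) queryFunc {
	return func(ctx context.Context, q string, _ time.Time) (string, error) {
		select {
		case <-time.After(delay):
			return "result for " + q, nil
		case <-ctx.Done():
			return "", ctx.Err()
		}
	}
}

// timedOut reports whether err was caused by an expired deadline.
func timedOut(err error) bool { return errors.Is(err, context.DeadlineExceeded) }

// runDemo runs one query with the given backend delay and timeout (milliseconds).
func runDemo(delayMs, timeoutMs int) (string, error) {
	q := withTimeout(slowQuery(time.Duration(delayMs)*time.Millisecond), time.Duration(timeoutMs)*time.Millisecond)
	return q(context.Background(), "up", time.Now())
}

func main() {
	_, err := runDemo(500, 10)
	fmt.Println("slow backend timed out:", timedOut(err))
}
```

The timeout only helps if the underlying request honours the context, which is why the simulated backend selects on `ctx.Done()`.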
pkg/http/http.go
Outdated
| } | ||
| // FanoutClient represents a client that can send requests to a cluster of HTTP-based endpoints. | ||
| type FanoutClient struct { |
Again, fanout might be bad wording here, why not just Client?
There was a problem hiding this comment.
I'm bad at naming 😄 I think I dismissed Client because it could be confused with http.Client, but it's indeed better than the incorrect FanoutClient.
cmd/thanos/rule.go
Outdated
| span.Finish() | ||
| for _, i := range rand.Perm(len(queriers)) { | ||
| querier := queriers[i] | ||
| c := promclient.NewClient(logger, querier) |
Why not create this at the start of the ruler? (:
bwplotka
left a comment
Thanks for doing this and addressing all comments! Super tiny nits and we can merge IMO 👍
Rebase needed ):
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Closes #1778
Changes
Similar to #1838, this PR adds TLS and authentication support to Thanos Ruler for the query API endpoints. The YAML configuration format is very similar to the one used for configuring Alertmanager.
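As a rough illustration of what TLS and authentication support amounts to at the HTTP client level, the sketch below builds an `http.Client` from a small config struct. Everything named here (`clientConfig`, its fields, `basicAuthTransport`, `newClient`, `demo`, and the `ruler`/`secret` credentials) is hypothetical and does not reflect the actual Thanos YAML schema or the code in this PR.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// clientConfig is a hypothetical, simplified configuration.
// It is NOT the Thanos configuration format.
type clientConfig struct {
	Username string
	Password string
	RootCAs  *x509.CertPool // CAs trusted when verifying the server certificate.
}

// basicAuthTransport adds HTTP basic auth credentials to every request.
type basicAuthTransport struct {
	next               http.RoundTripper
	username, password string
}

func (t basicAuthTransport) RoundTrip(r *http.Request) (*http.Response, error) {
	r2 := r.Clone(r.Context()) // Do not mutate the caller's request.
	r2.SetBasicAuth(t.username, t.password)
	return t.next.RoundTrip(r2)
}

// newClient returns a client that verifies servers against cfg.RootCAs and,
// when a username is set, authenticates with basic auth.
func newClient(cfg clientConfig) *http.Client {
	var rt http.RoundTripper = &http.Transport{
		TLSClientConfig: &tls.Config{RootCAs: cfg.RootCAs, MinVersion: tls.VersionTLS12},
	}
	if cfg.Username != "" {
		rt = basicAuthTransport{next: rt, username: cfg.Username, password: cfg.Password}
	}
	return &http.Client{Transport: rt}
}

// demo starts a throwaway TLS server that requires ruler/secret and returns
// the status code obtained with the given credentials.
func demo(user, pass string) (int, error) {
	srv := httptest.NewTLSServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		u, p, ok := r.BasicAuth()
		if !ok || u != "ruler" || p != "secret" {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		fmt.Fprint(w, "ok")
	}))
	defer srv.Close()

	// Trust the test server's self-signed certificate.
	pool := x509.NewCertPool()
	pool.AddCert(srv.Certificate())

	resp, err := newClient(clientConfig{Username: user, Password: pass, RootCAs: pool}).Get(srv.URL)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	_, _ = io.Copy(io.Discard, resp.Body)
	return resp.StatusCode, nil
}

func main() {
	code, err := demo("ruler", "secret")
	fmt.Println(code, err)
}
```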
Verification
I have adapted the end-to-end tests to use the new parameters. The tests already exercised static addresses and file SD.