Ready state handling of sidecar #1677

@jabbrwcky

We hit a few bumps upgrading to Thanos 0.8.1 (Docker image).

Looking into the sidecar, I noticed that its ready state is only set during the initial loading of the Prometheus external labels:

err := runutil.Retry(2*time.Second, ctx.Done(), func() error {
	if err := m.UpdateLabels(ctx, logger); err != nil {
		level.Warn(logger).Log(
			"msg", "failed to fetch initial external labels. Is Prometheus running? Retrying",
			"err", err,
		)
		promUp.Set(0)
		statusProber.SetNotReady(err)
		return err
	}
	level.Info(logger).Log(
		"msg", "successfully loaded prometheus external labels",
		"external_labels", m.Labels().String(),
	)
	promUp.Set(1)
	statusProber.SetReady()
	lastHeartbeat.Set(float64(time.Now().UnixNano()) / 1e9)
	return nil
})

The recurring check updates the 'prometheus_up' metric, but not the sidecar ready state:

return runutil.Repeat(30*time.Second, ctx.Done(), func() error {
	iterCtx, iterCancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer iterCancel()
	if err := m.UpdateLabels(iterCtx, logger); err != nil {
		level.Warn(logger).Log("msg", "heartbeat failed", "err", err)
		promUp.Set(0)
	} else {
		promUp.Set(1)
		lastHeartbeat.Set(float64(time.Now().UnixNano()) / 1e9)
	}
	return nil
})

Is this intentional?

I would assume that when Prometheus is considered unhealthy or not ready, the sidecar should report the same.

Please correct me if I am wrong.
