receive: Add liveness and readiness probe #1537
37839f0 to 4d3d4ef
cc @FUSAKLA
cmd/thanos/receive.go (outdated)

    s := newStoreGRPCServer(logger, reg, tracer, tsdbStore, opts)

    level.Info(logger).Log("msg", "listening for StoreAPI gRPC", "address", grpcBindAddr)
    statusProber.SetReady()
I think the receiver should probably not be ready until the TSDB is ready? I'm also not sure about the hashring, and the receive interface is not guaranteed to be up at this point either.
Maybe this will require a more complex condition for the ready state 🤔
@FUSAKLA The TSDB is ready at this stage; see line 270. This code runs after the TSDB is opened.
For the receive interface, I thought that if something goes south it would change the liveness state, so a readiness check wouldn't be needed.
I guess I need to double-check the hashring readiness.
I'll have another look at it.
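The "more complex condition" discussed above could be sketched as a small tracker that reports ready only once every named dependency has checked in. This is an illustration, not the Thanos `prober` API: the `readiness` type and the condition names (`tsdb`, `hashring`, `receive-handler`) are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// readiness is a hypothetical helper (not the Thanos prober): it reports
// ready only once every named condition has been marked as done.
type readiness struct {
	mu      sync.Mutex
	pending map[string]bool
}

// newReadiness registers all conditions as pending.
func newReadiness(conditions ...string) *readiness {
	r := &readiness{pending: make(map[string]bool, len(conditions))}
	for _, c := range conditions {
		r.pending[c] = true
	}
	return r
}

// Done marks a single condition as satisfied.
func (r *readiness) Done(condition string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.pending, condition)
}

// Ready reports whether all conditions are satisfied.
func (r *readiness) Ready() bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.pending) == 0
}

// Pending lists the unsatisfied conditions in sorted order.
func (r *readiness) Pending() []string {
	r.mu.Lock()
	defer r.mu.Unlock()
	out := make([]string, 0, len(r.pending))
	for c := range r.pending {
		out = append(out, c)
	}
	sort.Strings(out)
	return out
}

func main() {
	r := newReadiness("tsdb", "hashring", "receive-handler")

	r.Done("tsdb") // e.g. after the TSDB has been opened
	fmt.Println(r.Ready(), r.Pending())

	r.Done("hashring")
	r.Done("receive-handler")
	fmt.Println(r.Ready(), r.Pending())
}
```

With a tracker like this, a call such as `statusProber.SetReady()` would be made only when `Ready()` turns true, rather than at a fixed point in the startup sequence.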
105037a to 43c5a8a
Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>
43c5a8a to c4279bb
    s := newStoreGRPCServer(logger, reg, tracer, tsdbStore, opts)

    // Wait hashring to be ready before start serving metrics
    <-hashringReady
Why are we waiting for the hashring to be ready before serving metrics from the store? These two things are entirely independent, IMO.
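One way to address this concern, sketched under assumptions: start the store server right away and gate only the readiness flag on the hashring. `startReceiver` and `serveStore` are hypothetical names standing in for the wiring in `receive.go`; only the `hashringReady` channel comes from the snippet above.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// startReceiver is a hypothetical stand-in for the wiring in receive.go.
// serveStore is started immediately; only the ready flag waits for the hashring.
func startReceiver(hashringReady <-chan struct{}, serveStore func()) (ready *atomic.Bool, done <-chan struct{}) {
	ready = &atomic.Bool{}
	d := make(chan struct{})

	// The StoreAPI does not depend on the hashring, so it starts right away.
	go serveStore()

	// Readiness is the only thing gated on the hashring.
	go func() {
		defer close(d)
		<-hashringReady
		ready.Store(true)
	}()

	return ready, d
}

func main() {
	hashringReady := make(chan struct{})
	storeServing := make(chan struct{})

	ready, done := startReceiver(hashringReady, func() { close(storeServing) })

	<-storeServing // The store is serving although the hashring is not loaded yet.
	fmt.Println("store serving, ready:", ready.Load())

	close(hashringReady) // Simulate the hashring configuration being loaded.
	<-done
	fmt.Println("hashring loaded, ready:", ready.Load())
}
```

The design choice here is that blocking on `<-hashringReady` happens in its own goroutine, so nothing else in startup is delayed by it.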
* Add prober to receive
* Add changelog entries
* Update README
* Remove default
* Wait hashring to be ready

Signed-off-by: Kemal Akkoyun <kakkoyun@gmail.com>
This PR adds:
* `/-/healthy` endpoint for liveness checks.
* `/-/ready` endpoint for readiness checks.

Changes
* `/-/healthy` endpoint for liveness checks.
* `/-/ready` endpoint for readiness checks.
* `prober.Prober` for readiness and liveness endpoints.

Verification
* `make test`
* Started `thanos receive` and made a request to the related endpoints.
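
The manual check described under Verification could also be approximated in-process. The sketch below is not the `prober.Prober` implementation: the `probe` type and its helpers are hypothetical, and the choice of 503 for a not-yet-ready or unhealthy state is an assumption. Only the two endpoint paths come from this PR.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// probe is a hypothetical, simplified stand-in for a prober.
type probe struct {
	healthy atomic.Bool
	ready   atomic.Bool
}

// statusHandler answers 200 when the flag is set and 503 otherwise
// (the 503 is an assumption made for this sketch).
func statusHandler(ok *atomic.Bool) http.HandlerFunc {
	return func(w http.ResponseWriter, _ *http.Request) {
		if !ok.Load() {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	}
}

// newProbeHandler wires both endpoints into a fresh mux.
func newProbeHandler() (*probe, http.Handler) {
	p := &probe{}
	mux := http.NewServeMux()
	mux.HandleFunc("/-/healthy", statusHandler(&p.healthy))
	mux.HandleFunc("/-/ready", statusHandler(&p.ready))
	return p, mux
}

// status performs an in-memory GET request and returns the response code.
func status(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	p, h := newProbeHandler()

	p.healthy.Store(true) // The process is alive but still starting up.
	fmt.Println(status(h, "/-/healthy"), status(h, "/-/ready"))

	p.ready.Store(true) // Startup has finished.
	fmt.Println(status(h, "/-/healthy"), status(h, "/-/ready"))
}
```

Against a running `thanos receive`, the same two paths would be requested on its HTTP address instead of through `httptest`.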