Skip to content

Conversation

@sallyom
Copy link
Contributor

@sallyom sallyom commented Dec 2, 2019

example #2 for using library-go/test/metrics functions. This PR adds an e2e to verify cluster-kube-scheduler-operator metrics are accessible.

see openshift/library-go#602

@openshift-ci-robot openshift-ci-robot added the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Dec 2, 2019
@openshift-ci-robot
Copy link

@sallyom: This pull request references Bugzilla bug 1771810, which is invalid:

  • expected the bug to target the "4.4.0" release, but it targets "4.3.0" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

Details

In response to this:

Bug 1771810: Add e2e to verify kso metrics accessible

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Dec 2, 2019
@ingvagabund
Copy link
Member

This PR adds an e2e to verify cluster-kube-scheduler-operator metrics are accessible.

Is it required to access the metrics through kubectl? Or, is the goal of the PR to verify (using any method) the metrics are accessible? If so, would it be easier to access the metrics from the metrics endpoint? (I am just asking, I have never consumed the metrics other than directly from a component while running in the same binary).

@sallyom
Copy link
Contributor Author

sallyom commented Dec 2, 2019

This PR adds an e2e to verify cluster-kube-scheduler-operator metrics are accessible.

Is it required to access the metrics through kubectl? Or, is the goal of the PR to verify (using any method) the metrics are accessible? If so, would it be easier to access the metrics from the metrics endpoint? (I am just asking, I have never consumed the metrics other than directly from a component while running in the same binary).

The goal is to access metrics and also verify cluster CA -this will eventually be used to verify successful cert rotation- what I landed upon (after days of trial and error) is to access metrics via internal service url with curl cmd secured via cluster CA (generated by service-ca-operator/annotated configmap). I also had a version using the external route, and go code instead of curl - but I could not utilize service CA - only way that worked was to set InsecureSkipVerify in TLSconfig - but exec-ing in a pod using internal url works perfectly. I'm open to using anything that works, though!!

@sallyom
Copy link
Contributor Author

sallyom commented Dec 2, 2019

/test e2e-aws-operator

@ingvagabund
Copy link
Member

I'm open to using anything that works, though!!

I have zero experience in this area. Though, as long as it's used in tests, I have no objection. Are you aware of other components/operator that test the same? Or, are you pioneering it right now? :)

@s-urbaniak
Copy link
Contributor

@sallyom @ingvagabund hello from the monitoring team! :-) 👋 @paulfantom asked me to help out here, so don't hesitate also to ping me on #forum-monitoring in slack.

I do recommend that you access your metrics via the exposed Thanos routes (which internally forward to Prometheus). That way you will verify that your metrics are being successfully scraped and hence also verifies the RBAC setup towards your scraping targets. I would not advise to check metrics in your local metrics endpoint only.

Regarding querying metrics you could do the approach done here or create your own http client (as we did) after finding out the external route and token: https://github.com/openshift/cluster-monitoring-operator/blob/8e5623597b94d2f62c38bc427c23a820a01f002f/test/e2e/framework/route_client.go#L47-L76

@sallyom
Copy link
Contributor Author

sallyom commented Dec 3, 2019

Regarding querying metrics you could do the approach done here or create your own http client (as we did) after finding out the external route and token: https://github.com/openshift/cluster-monitoring-operator/blob/8e5623597b94d2f62c38bc427c23a820a01f002f/test/e2e/framework/route_client.go#L47-L76

thanks @s-urbaniak yes, I was under impression that a TLSconfig secured via CAbundle was required, but as it turns out it is not - so, I am going w/ the external prom url /route client - i had that working a few days ago but only w/ InsecureSkipVerify - and I thought we needed to avoid that but alas after much discussion we do not - so I'm back to using the route (very similar to the route_client setup you pointed to). That will live in library-go/test/library/metrics/query.go.

@s-urbaniak
Copy link
Contributor

@sallyom sounds good, yes we are also "InsecureSkipVerify'ing" the TLS connection via the exposed routes right now.

@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 19, 2019
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 24, 2019
@sallyom sallyom force-pushed the bz1771810 branch 2 times, most recently from 2c95a69 to b7e0a6a Compare January 6, 2020 23:28
@openshift-ci-robot openshift-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jan 6, 2020
@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 7, 2020
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 7, 2020
@sallyom sallyom force-pushed the bz1771810 branch 2 times, most recently from 089811d to aaa6d16 Compare January 7, 2020 16:51
@sallyom
Copy link
Contributor Author

sallyom commented Jan 7, 2020

/bugzilla refresh

@openshift-ci-robot openshift-ci-robot added bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. and removed bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. labels Jan 7, 2020
@openshift-ci-robot
Copy link

@sallyom: This pull request references Bugzilla bug 1771810, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

/bugzilla refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sallyom sallyom force-pushed the bz1771810 branch 2 times, most recently from 9a76910 to 02f6c18 Compare January 13, 2020 19:52
@sallyom
Copy link
Contributor Author

sallyom commented Jan 14, 2020

/retest

Copy link
Contributor

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve
/hold
for nit resolution

Makefile Outdated
$(call add-bindata,v4.1.0,./bindata/v4.1.0/...,bindata,v410_00_assets,pkg/operator/v410_00_assets/bindata.go)

e2e: GO_TEST_PACKAGES :=./test/e2e
e2e: GO_TEST_FLAGS += -v
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need that verbosity, if the test fails, we'll get it, otherwise it's not needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed it

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 16, 2020
@openshift-ci-robot openshift-ci-robot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jan 16, 2020
}

// do something with result, eventually this will be tested b4 and after rotation to ensure values differ
fmt.Printf("result from prometheus query `scheduler_scheduling_duration_seconds_sum`: %v", response)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe change this to t.Logf

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Jan 20, 2020
@sallyom
Copy link
Contributor Author

sallyom commented Jan 29, 2020

/retest

Copy link
Contributor

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/hold cancel

@openshift-ci-robot openshift-ci-robot added lgtm Indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Jan 31, 2020
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sallyom, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 25e7a32 into openshift:master Jan 31, 2020
@openshift-ci-robot
Copy link

@sallyom: All pull requests linked via external trackers have merged. Bugzilla bug 1771810 has been moved to the MODIFIED state.

Details

In response to this:

Bug 1771810: Add e2e to verify kso metrics accessible

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants