Bug 1808568: must-gather: when openshift-apiserver restarts, ignore wrong prefix #24641
Conversation
@sallyom: This pull request references Bugzilla bug 1808568, which is invalid.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@sallyom: No Bugzilla bug is referenced in the title of this pull request.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@sallyom: This pull request references Bugzilla bug 1808568, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/retest
/test e2e-aws-serial
/hold

This is a symptom of the openshift-apiserver not correctly flushing its audit log to disk, or of must-gather not correctly retrieving the content. Either way, do we know why this suddenly became a problem?
From my investigation, this was happening while openshift-apiserver was restarting, but back then it wasn't such a big issue. I'm not sure why this e2e flake has recently bubbled up so high. I agree that we should investigate the root cause, but at the same time I'd prefer to make this test less strict. We can file a separate BZ against openshift-apiserver for further investigation and make the test more resilient in parallel. @deads2k objections?
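Purely as an illustration of the "less strict" idea (not the actual diff in this PR): a minimal Go sketch that tolerates a bounded number of gathered audit files with an unexpected name prefix, so a brief openshift-apiserver restart does not fail the must-gather check outright. The package name, function names, prefix list, and threshold below are assumptions made for the sketch.

```go
// Sketch only: a lenient prefix check for gathered audit-log files.
// Names and the tolerance value are illustrative, not taken from the PR.
package mustgather

import (
	"path/filepath"
	"strings"
)

// countUnexpectedPrefixes returns how many gathered files do not start
// with one of the expected audit-log prefixes.
func countUnexpectedPrefixes(files []string, expectedPrefixes []string) int {
	unexpected := 0
	for _, f := range files {
		name := filepath.Base(f)
		matched := false
		for _, p := range expectedPrefixes {
			if strings.HasPrefix(name, p) {
				matched = true
				break
			}
		}
		if !matched {
			unexpected++
		}
	}
	return unexpected
}

// tolerateRestartNoise fails only when more than maxUnexpected files carry a
// wrong prefix, so a short apiserver restart window no longer flakes the test.
func tolerateRestartNoise(files []string, expectedPrefixes []string, maxUnexpected int) bool {
	return countUnexpectedPrefixes(files, expectedPrefixes) <= maxUnexpected
}
```

For example, tolerateRestartNoise(files, []string{"audit-"}, 2) would pass as long as at most two gathered files carry an unexpected prefix, rather than failing on the first one.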
soltysh left a comment
/lgtm
/approve
I'll let @deads2k remove the hold if he agrees with my previous statement
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sallyom, soltysh

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
I've opened https://bugzilla.redhat.com/show_bug.cgi?id=1811737 to track the root cause.
Yeah, this exposed a critical data corruption problem with the audit log that we fixed with openshift/cluster-openshift-apiserver-operator#331. Have you seen it since then?
/lgtm cancel |
This is no longer required, AFAICT, since openshift/cluster-openshift-apiserver-operator#331 resolved the root cause.
/assign @soltysh
/cc @smarterclayton
example failure logs where this flake occurs: