-
Notifications
You must be signed in to change notification settings - Fork 213
Bug 1903382: pkg/payload/task_graph: Require firstIncompleteNode to have tasks #484
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug 1903382: pkg/payload/task_graph: Require firstIncompleteNode to have tasks #484
Conversation
Our graph has some nodes without tasks, e.g. nodes which create a
choke point between two sets of parallel nodes. Ideally we give all
nodes descriptive names [1], but until we have that, require at least
one entry in Tasks before we store a node as firstIncompleteNode.
This avoids the chance of panicking during the subsequent [2]:
fmt.Errorf("%d incomplete task nodes, beginning with %s", incompleteCount, firstIncompleteNode.Tasks[0])
[1]: openshift#435
[2]: https://bugzilla.redhat.com/show_bug.cgi?id=1903382
|
@wking: This pull request references Bugzilla bug 1903382, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
| for i, result := range results { | ||
| if result == nil { | ||
| if firstIncompleteNode == nil { | ||
| if firstIncompleteNode == nil && len(graph.Nodes[i].Tasks) > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you do this, the check on line 548 needs to consider the case where firstIncompleteNode is nil and the incompleteCount is greater than zero.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line 548 already has a guard for firstIncompleteNode != nil. If incompleteCount is greater than zero, and firstIncompleteNode is nil, I'm not sure we care, since no meaningful work was left on the table. But I can check nextNode to make sure...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still not sure why the Task i.e. .Tasks[0] is nil.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but you don't produce the error in that case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still not sure why the Task i.e.
.Tasks[0]is nil.
We have empty placeholder nodes in the graph, like:
But I can check
nextNodeto make sure...
I'm pretty sure we're fine with the change I'm proposing, without needing to involve nextNode. Cases:
- All result-less nodes have no tasks. Then with my change I will never set
firstIncompleteNode, and we will not inject anincomplete task nodes, beginning with...error. But there were no outstanding tasks, so not injecting that error is an improvement. - The first result-less node has at least one task. Then that matches my guard, and we get exactly the same non-panicky
firstIncompleteNodehandling as we have in master today. - The first result-less node has no tasks, but some later result-less node has at least one task. Then that later node will match my guard, and we'll mention it with a
incomplete task nodes, beginning with...error instead of panicking (where we panic today).
|
Why is it preferred to change the selection of |
But without some of the node-naming work from #435, we can't say "the empty node between $GROUP_A and $GROUP_B was canceled", and have to say "some empty node was canceled; no idea how far we got through the graph or what real work was left uncovered". It seems more useful to try to step forward and figure out what the real work is that we're not getting to. |
|
@wking: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jottofar, wking The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retest |
|
/retest Please review the full test history for this PR and help us cut down flakes. |
3 similar comments
|
/retest Please review the full test history for this PR and help us cut down flakes. |
|
/retest Please review the full test history for this PR and help us cut down flakes. |
|
/retest Please review the full test history for this PR and help us cut down flakes. |
|
@wking: All pull requests linked via external trackers have merged: Bugzilla bug 1903382 has been moved to the MODIFIED state. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/cherrypick release-4.6 |
|
@wking: new pull request created: #510 DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |

Our graph has some nodes without tasks, e.g. nodes which create a choke point between two sets of parallel nodes. Ideally we give all nodes descriptive names (#435), but until we have that, require at least one entry in
Tasksbefore we store a node asfirstIncompleteNode.This avoids the chance of panicking during the subsequent: