Labels
- P3: Low priority
- area/controller: Controller issues, panics
- area/gc: Garbage collection, such as TTLs, retentionPolicy, delays, and more
- solution/suggested: A solution to the bug has been suggested. Someone needs to implement it.
Description
Pre-requisites
- I have double-checked my configuration
- I can confirm the issue exists when I tested with :latest
- I'd like to contribute the fix myself (see contributing guide)
What happened/what you expected to happen?
I submitted a workflow that is expected to fail, and it does, but the workflow is deleted immediately.
Within seconds of failing, the workflow is gone.
I expected failed workflows to remain, as they did in earlier versions. I have not configured the workflow-controller-configmap to delete them. Relevant configuration:
completed: 400
spec:
  activeDeadlineSeconds: 86400
  podGC:
    strategy: OnPodSuccess
Version
v3.5.0
Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  namespace: argo
  generateName: hello-world-
  labels:
    workflows.argoproj.io/archive-strategy: "false"
  annotations:
    workflows.argoproj.io/description: |
      This is a simple hello world example.
spec:
  entrypoint: helloworld
  templates:
    - name: helloworld
      container:
        image: ghcr.io/openzipkin/alpine:3.18.2
        command: ["/bin/sh"]
        args: ["-c", "exit 1"]
Logs from the workflow controller
kubectl logs -n argo deploy/workflow-controller | grep ${workflow}
time="2023-11-03T20:38:15.027Z" level=info msg="Processing workflow" namespace=argo workflow=hello-world-59xpp
time="2023-11-03T20:38:15.032Z" level=info msg="Updated phase -> Running" namespace=argo workflow=hello-world-59xpp
time="2023-11-03T20:38:15.033Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=argo workflow=hello-world-59xpp
time="2023-11-03T20:38:15.033Z" level=info msg="was unable to obtain node for , letting display name to be nodeName" namespace=argo workflow=hello-world-59xpp
time="2023-11-03T20:38:15.033Z" level=info msg="Pod node hello-world-59xpp initialized Pending" namespace=argo workflow=hello-world-59xpp
time="2023-11-03T20:38:15.057Z" level=info msg="Created pod: hello-world-59xpp (hello-world-59xpp)" namespace=argo workflow=hello-world-59xpp
time="2023-11-03T20:38:15.057Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=hello-world-59xpp
time="2023-11-03T20:38:15.057Z" level=info msg=reconcileAgentPod namespace=argo workflow=hello-world-59xpp
time="2023-11-03T20:38:15.070Z" level=info msg="Workflow update successful" namespace=argo phase=Running resourceVersion=1703572159 workflow=hello-world-59xpp
time="2023-11-03T20:38:25.061Z" level=info msg="Processing workflow" namespace=argo workflow=hello-world-59xpp
time="2023-11-03T20:38:25.062Z" level=info msg="Task-result reconciliation" namespace=argo numObjs=0 workflow=hello-world-59xpp
time="2023-11-03T20:38:25.062Z" level=info msg="Pod failed: Error (exit code 1)" displayName=hello-world-59xpp namespace=argo pod=hello-world-59xpp templateName=helloworld workflow=hello-world-59xpp
time="2023-11-03T20:38:25.063Z" level=info msg="node changed" namespace=argo new.message="Error (exit code 1)" new.phase=Failed new.progress=0/1 nodeID=hello-world-59xpp old.message= old.phase=Pending old.progress=0/1 workflow=hello-world-59xpp
time="2023-11-03T20:38:25.063Z" level=info msg="TaskSet Reconciliation" namespace=argo workflow=hello-world-59xpp
time="2023-11-03T20:38:25.063Z" level=info msg=reconcileAgentPod namespace=argo workflow=hello-world-59xpp
time="2023-11-03T20:38:25.063Z" level=info msg="Updated phase Running -> Failed" namespace=argo workflow=hello-world-59xpp
time="2023-11-03T20:38:25.063Z" level=info msg="Updated message -> Error (exit code 1)" namespace=argo workflow=hello-world-59xpp
time="2023-11-03T20:38:25.063Z" level=info msg="Marking workflow completed" namespace=argo workflow=hello-world-59xpp
time="2023-11-03T20:38:25.068Z" level=info msg="cleaning up pod" action=deletePod key=argo/hello-world-59xpp-1340600742-agent/deletePod
time="2023-11-03T20:38:25.071Z" level=info msg="Workflow update successful" namespace=argo phase=Failed resourceVersion=1703572339 workflow=hello-world-59xpp
time="2023-11-03T20:38:25.073Z" level=info msg="Queueing Failed workflow argo/hello-world-59xpp for delete due to max rention(0 workflows)"
time="2023-11-03T20:38:25.073Z" level=info msg="Deleting garbage collected workflow 'argo/hello-world-59xpp'"
time="2023-11-03T20:38:25.081Z" level=info msg="cleaning up pod" action=labelPodCompleted key=argo/hello-world-59xpp/labelPodCompleted
time="2023-11-03T20:38:25.081Z" level=info msg="Successfully request 'argo/hello-world-59xpp' to be deleted"
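The controller log above reports a retention limit of 0 for Failed workflows, while the relevant configuration sets only `completed`. A possible workaround, not verified against v3.5.0, is to set every limit explicitly. The fragment below is a minimal sketch: it assumes `completed: 400` sits under the `retentionPolicy` key of the workflow-controller-configmap (the enclosing key is not shown in this report), and the `failed` and `errored` values are illustrative, not taken from the original configuration.

```yaml
# Sketch of the retentionPolicy entry in workflow-controller-configmap.
# Assumption: the enclosing key is retentionPolicy; this report only shows "completed: 400".
retentionPolicy: |
  completed: 400   # value from this report
  failed: 400      # illustrative value, set explicitly so the limit is not left at 0
  errored: 400     # illustrative value, set explicitly so the limit is not left at 0
```

If failed workflows are still deleted with explicit limits in place, the limits themselves are probably not the cause.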
Logs from your workflow's wait container
kubectl logs -n argo -c wait -l workflows.argoproj.io/workflow=${workflow},workflow.argoproj.io/phase!=Succeeded
time="2023-11-03T20:38:19.525Z" level=info msg="Creating minio client using static credentials" endpoint=deap-api.decloud.xxx.com
time="2023-11-03T20:38:19.525Z" level=info msg="Saving file to s3" bucket=deap-argo-prod endpoint=deap-api.decloud.xxx.com key="dev\\ /2023\\ /11\\ /03\\ /hello-world-59xpp\\ /hello-world-59xpp\"/main.log" path=/tmp/argo/outputs/logs/main.log
time="2023-11-03T20:38:19.594Z" level=info msg="Save artifact" artifactName=main-logs duration=94.016454ms error="<nil>" key="dev\\ /2023\\ /11\\ /03\\ /hello-world-59xpp\\ /hello-world-59xpp\"/main.log"
time="2023-11-03T20:38:19.594Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/logs/main.log
time="2023-11-03T20:38:19.594Z" level=info msg="Successfully saved file: /tmp/argo/outputs/logs/main.log"
time="2023-11-03T20:38:19.604Z" level=warning msg="failed to patch task set, falling back to legacy/insecure pod patch, see https://argoproj.github.io/argo-workflows/workflow-rbac/" error="workflowtaskresults.argoproj.io is forbidden: User \"system:serviceaccount:argo:default\" cannot create resource \"workflowtaskresults\" in API group \"argoproj.io\" in the namespace \"argo\""
time="2023-11-03T20:38:19.605Z" level=warning msg="Non-transient error: pods \"hello-world-59xpp\" is forbidden: User \"system:serviceaccount:argo:default\" cannot patch resource \"pods\" in API group \"\" in the namespace \"argo\""
time="2023-11-03T20:38:19.606Z" level=error msg="executor error: pods \"hello-world-59xpp\" is forbidden: User \"system:serviceaccount:argo:default\" cannot patch resource \"pods\" in API group \"\" in the namespace \"argo\""
time="2023-11-03T20:38:19.606Z" level=info msg="Alloc=12580 TotalAlloc=18686 Sys=30309 NumGC=4 Goroutines=10"
time="2023-11-03T20:38:19.606Z" level=fatal msg="pods \"hello-world-59xpp\" is forbidden: User \"system:serviceaccount:argo:default\" cannot patch resource \"pods\" in API group \"\" in the namespace \"argo\""
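Separately from the deletion problem, the wait container logs show that `system:serviceaccount:argo:default` can neither create `workflowtaskresults` nor patch pods. Following the workflow RBAC page linked in the warning, a Role along these lines would grant the first permission. This is a sketch only: the names `executor` and `executor-default` are placeholders chosen for illustration, and the namespace and service account are taken from the log messages.

```yaml
# Sketch: let the workflow's service account write WorkflowTaskResults.
# Role and RoleBinding names are hypothetical; adjust them to your cluster.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: executor
  namespace: argo
rules:
  - apiGroups:
      - argoproj.io
    resources:
      - workflowtaskresults
    verbs:
      - create
      - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: executor-default
  namespace: argo
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: executor
subjects:
  - kind: ServiceAccount
    name: default      # service account named in the log messages
    namespace: argo
```

With this in place the executor should no longer fall back to the legacy pod patch, so the "cannot patch resource pods" error would not be reached.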