Skip to content

Conversation

@kaovilai
Copy link
Collaborator

@kaovilai kaovilai commented Apr 25, 2025

Design for #8869

Signed-off-by: Tiger Kaovilai [email protected]

Thank you for contributing to Velero!

Please add a summary of your change

Architecture Diagram

graph TB
    subgraph "User Configuration"
        CLI[velero install CLI]
        CLI -->|--server-priority-class-name| ServerFlag[Server Priority Class]
        CLI -->|--node-agent-priority-class-name| NodeAgentFlag[Node Agent Priority Class]
    end

    subgraph "Velero Components"
        ServerFlag --> Deployment[Velero Server Deployment]
        NodeAgentFlag --> DaemonSet[Node Agent DaemonSet]
        
        Deployment -->|stores config| ConfigMap1[velero-config ConfigMap]
        DaemonSet -->|stores config| ConfigMap2[node-agent-config ConfigMap]
    end

    subgraph "Data Movement Pods"
        ConfigMap2 -->|reads priority class| DataUpload[Data Upload Controller]
        ConfigMap2 -->|reads priority class| DataDownload[Data Download Controller]
        DataUpload --> DataMoverPod1[Data Mover Pods]
        DataDownload --> DataMoverPod2[Data Mover Pods]
    end

    subgraph "Maintenance Jobs"
        ConfigMap1 -->|global config only| MaintenanceConfig[Maintenance Job Config]
        MaintenanceConfig --> MaintenanceJob[Repository Maintenance Jobs]
    end

    subgraph "CSI Snapshot Operations"
        ConfigMap2 -->|reads priority class| CSISnapshot[CSI Snapshot Exposer]
        ConfigMap2 -->|reads priority class| GenericRestore[Generic Restore Exposer]
        CSISnapshot --> ExposeePod1[Exposee Pods]
        GenericRestore --> ExposeePod2[Restore Pods]
    end

    style CLI fill:#e1f5fe
    style Deployment fill:#81c784
    style DaemonSet fill:#81c784
    style DataMoverPod1 fill:#ffb74d
    style DataMoverPod2 fill:#ffb74d
    style MaintenanceJob fill:#ce93d8
    style ExposeePod1 fill:#ffb74d
    style ExposeePod2 fill:#ffb74d
Loading

Does your change fix a particular issue?

Implementation: #8883

Please indicate you've done the following:

@github-actions github-actions bot added the Area/Design Design Documents label Apr 25, 2025
@kaovilai kaovilai mentioned this pull request Apr 25, 2025
6 tasks
@kaovilai
Copy link
Collaborator Author

/kind changelog-not-required

@github-actions github-actions bot added the kind/changelog-not-required PR does not require a user changelog. Often for docs, website, or build changes label Apr 25, 2025

1. The priority class name is set on the Velero deployment and node agent daemonset during installation.
2. For maintenance jobs, the priority class name is inherited from the Velero server deployment through the `GetPriorityClassNameFromVeleroServer` function in the `veleroutil` package.
3. For data mover pods created by the exposer, the priority class name is inherited from the node agent daemonset through the `getInheritedPodInfo` function, which calls `nodeagent.GetPodSpec` to get the pod spec from the node agent daemonset.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Data movers should be with low priority in most cases, in another word, when resource is limited, data movers should not run.

If we decide to inherit from node-agent, we should make sure the priority class configuration is separate for node-agent and Velero server. cc @ywk253100

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to have a low default? Or leave as k8s default but allow user to set themselves?

@codecov
Copy link

codecov bot commented May 13, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 60.05%. Comparing base (8ce513c) to head (ea20550).
Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8882      +/-   ##
==========================================
+ Coverage   60.03%   60.05%   +0.01%     
==========================================
  Files         379      379              
  Lines       43496    43496              
==========================================
+ Hits        26114    26122       +8     
+ Misses      15814    15807       -7     
+ Partials     1568     1567       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

"Priority class name for the node agent daemonset. Optional.",
)

flags.StringVar(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems we don't need this because we'll define the priority class in the repo maintenance configmap?

Copy link
Collaborator Author

@kaovilai kaovilai Jun 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The velero install can create configmap containing maintenance priority class for you. WDYT?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok will move to existing configmap

}
},
"namespace1-default-kopia": {
"priorityClassName": "medium-priority"
Copy link
Contributor

@ywk253100 ywk253100 Jun 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there is requirement to set the priority class in backup repository level? All maintenance jobs should have the same priority? @Lyndon-Li WDYT?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree.

ywk253100
ywk253100 previously approved these changes Jun 4, 2025
Copy link
Contributor

@ywk253100 ywk253100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kaovilai
Copy link
Collaborator Author

kaovilai commented Jun 4, 2025

I think we'll still need install flag for daemonset priorityclass since it will be set to daemonset during install.

ywk253100
ywk253100 previously approved these changes Jun 4, 2025
ywk253100
ywk253100 previously approved these changes Jul 17, 2025
}
```

This will allow users to configure the priority class name for both the node agent daemonset and data mover pods through a single node-agent-configmap. For example:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for both the node agent daemonset and data mover pods.
This doesn't seem right. The PriorityClassName in Configs is only used for the data mover pods, not for the node-agent DaemonSet.

blackpiglet
blackpiglet previously approved these changes Jul 22, 2025
@blackpiglet blackpiglet merged commit 50c3094 into vmware-tanzu:main Jul 23, 2025
7 checks passed
@MaloLelandais MaloLelandais mentioned this pull request Jul 28, 2025
3 tasks
MaloLelandais pushed a commit to MaloLelandais/velero that referenced this pull request Jul 28, 2025
PriorityClass Support Design Proposal

Design for vmware-tanzu#8869

Signed-off-by: Tiger Kaovilai <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area/Design Design Documents kind/changelog-not-required PR does not require a user changelog. Often for docs, website, or build changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants