Description
Describe the problem/challenge you have
We need to perform safe stateful migrations at scale, across different cluster distributions and different storage providers.
Pod hooks are useful for quiescing and unquiescing workloads, but we have repeatedly observed that platform engineers do not have the luxury, visibility, time, or knowledge to go to each pod and add specific quiesce/unquiesce commands.
Additionally, Velero's Restic integration today does not back up or migrate orphaned PVC/PV pairs (those not mounted by any pod).
Describe the solution you'd like
We propose to create four new plugin hooks:
PreBackup
PostBackup
PreRestore
PostRestore
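To make the shape of the request more concrete, here is a minimal Go sketch of what the four hooks could look like from a plugin author's point of view. Everything in it is hypothetical: the interface names mirror the list above, but the method signatures, the `recorder` toy plugin, and the `runBackup`/`runRestore` helpers are illustrative assumptions, not Velero's existing plugin API. It also assumes that a failing pre hook aborts the operation, which is a design question the proposal would need to settle.

```go
package main

import "fmt"

// Hypothetical hook interfaces. The names mirror the proposal; the
// signatures are illustrative and are not part of Velero's plugin API.
type PreBackupHook interface{ PreBackup(backup string) error }
type PostBackupHook interface{ PostBackup(backup string) error }
type PreRestoreHook interface{ PreRestore(restore string) error }
type PostRestoreHook interface{ PostRestore(restore string) error }

// recorder is a toy plugin that implements all four hooks and records
// the order in which it was called.
type recorder struct{ calls []string }

func (r *recorder) note(step string) error {
	r.calls = append(r.calls, step)
	return nil
}

func (r *recorder) PreBackup(b string) error   { return r.note("PreBackup:" + b) }
func (r *recorder) PostBackup(b string) error  { return r.note("PostBackup:" + b) }
func (r *recorder) PreRestore(s string) error  { return r.note("PreRestore:" + s) }
func (r *recorder) PostRestore(s string) error { return r.note("PostRestore:" + s) }

// around runs pre, then the core step, then post.
// Assumption: the first error stops the sequence.
func around(pre, core, post func() error) error {
	if err := pre(); err != nil {
		return fmt.Errorf("pre hook: %w", err)
	}
	if err := core(); err != nil {
		return fmt.Errorf("core step: %w", err)
	}
	if err := post(); err != nil {
		return fmt.Errorf("post hook: %w", err)
	}
	return nil
}

// runBackup wraps a stand-in for the core backup with the backup hooks.
func runBackup(name string, pre PreBackupHook, post PostBackupHook, core func() error) error {
	return around(
		func() error { return pre.PreBackup(name) },
		core,
		func() error { return post.PostBackup(name) },
	)
}

// runRestore wraps a stand-in for the core restore with the restore hooks.
func runRestore(name string, pre PreRestoreHook, post PostRestoreHook, core func() error) error {
	return around(
		func() error { return pre.PreRestore(name) },
		core,
		func() error { return post.PostRestore(name) },
	)
}

func main() {
	r := &recorder{}
	if err := runBackup("demo", r, r, func() error { return r.note("backup") }); err != nil {
		fmt.Println("backup failed:", err)
	}
	if err := runRestore("demo", r, r, func() error { return r.note("restore") }); err != nil {
		fmt.Println("restore failed:", err)
	}
	fmt.Println(r.calls)
}
```

Keeping one small interface per hook would let a plugin implement only the hooks it needs, for example PreBackup on the source cluster and PostRestore on the destination.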
Once the hooks are implemented in the Velero core, we would write a custom Velero plugin that is triggered by the PreBackup hook.
Before the backup starts and the objects are serialized, this plugin would quiesce all workloads by setting replicas=0 on deployments, statefulsets, etc., and then mount all PVC/PV pairs in a staging pod.
Then Velero would take the backup of all objects: the deployments with replicas=0, and the staging pod and all PVC/PV pairs.
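As a rough illustration of the staging pod idea, the sketch below builds a pod description that mounts every PVC in a namespace, so that volumes with no running pod become reachable for a file-level backup. The types are simplified stand-ins defined here, not the real Kubernetes API types, and the pod name, image, command, and mount path layout are all placeholder assumptions.

```go
package main

import "fmt"

// Simplified stand-ins for the Kubernetes pod types. A real plugin
// would use the types from k8s.io/api/core/v1 instead.
type Volume struct {
	Name      string
	ClaimName string
}

type VolumeMount struct {
	Name      string
	MountPath string
}

type Pod struct {
	Name      string
	Namespace string
	Image     string
	Command   []string
	Volumes   []Volume
	Mounts    []VolumeMount
}

// buildStagingPod returns a pod that mounts each of the given PVCs.
// The pod does no work; it only keeps the volumes attached.
func buildStagingPod(namespace string, claims []string) Pod {
	pod := Pod{
		Name:      "staging-pod",               // hypothetical name
		Namespace: namespace,                   //
		Image:     "busybox",                   // placeholder image
		Command:   []string{"sleep", "86400"}, // placeholder idle command
	}
	for i, claim := range claims {
		volName := fmt.Sprintf("vol-%d", i)
		pod.Volumes = append(pod.Volumes, Volume{Name: volName, ClaimName: claim})
		pod.Mounts = append(pod.Mounts, VolumeMount{
			Name:      volName,
			MountPath: "/staging/" + claim, // assumed layout: one directory per claim
		})
	}
	return pod
}

func main() {
	pod := buildStagingPod("shop", []string{"orders-data", "orders-logs"})
	for i, v := range pod.Volumes {
		fmt.Printf("%s -> %s at %s\n", v.ClaimName, v.Name, pod.Mounts[i].MountPath)
	}
}
```

A real implementation would also have to deal with access modes and node placement (for example, ReadWriteOnce volumes bound to different nodes may need more than one staging pod), which this sketch ignores.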
To complete the migration of the workload, a restore would happen on the destination cluster with the quiesced workload.
A custom Velero plugin triggered by the PostRestore hook would then unquiesce the workload by deleting the staging pod and restoring the original number of replicas on the deployments, statefulsets, etc.
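One possible way to carry the original replica count from the source cluster to the destination is to store it in an annotation on the workload itself, so that it travels inside the backup. The sketch below shows that round trip on a simplified in-memory `Workload` type. The annotation key, the type, and both function names are hypothetical, and a real plugin would apply the same logic to Kubernetes objects through the API rather than to local structs.

```go
package main

import (
	"fmt"
	"strconv"
)

// Hypothetical annotation key used to remember the replica count.
const originalReplicasKey = "example.com/original-replicas"

// Workload is a simplified stand-in for a Deployment or StatefulSet.
type Workload struct {
	Kind        string
	Name        string
	Replicas    int
	Annotations map[string]string
}

// quiesce records the current replica count and scales the workload to zero.
// Calling it twice keeps the first recorded count.
func quiesce(w *Workload) {
	if _, done := w.Annotations[originalReplicasKey]; done {
		return
	}
	if w.Annotations == nil {
		w.Annotations = map[string]string{}
	}
	w.Annotations[originalReplicasKey] = strconv.Itoa(w.Replicas)
	w.Replicas = 0
}

// unquiesce restores the recorded replica count and removes the annotation.
// Workloads that were never quiesced are left untouched.
func unquiesce(w *Workload) error {
	raw, ok := w.Annotations[originalReplicasKey]
	if !ok {
		return nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return fmt.Errorf("%s %s: invalid replica annotation %q: %w", w.Kind, w.Name, raw, err)
	}
	w.Replicas = n
	delete(w.Annotations, originalReplicasKey)
	return nil
}

func main() {
	web := &Workload{Kind: "Deployment", Name: "web", Replicas: 3}

	quiesce(web) // PreBackup on the source cluster
	fmt.Println("after quiesce:", web.Replicas, web.Annotations)

	if err := unquiesce(web); err != nil { // PostRestore on the destination cluster
		fmt.Println("unquiesce failed:", err)
		return
	}
	fmt.Println("after unquiesce:", web.Replicas, web.Annotations)
}
```

Making `quiesce` idempotent matters here: if the PreBackup hook runs twice against an already scaled-down workload, the stored count must not be overwritten with 0.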
Of course, the pre/post backup/restore custom logic can be written outside Velero, but that would require a separate operator or orchestrator. We think Velero can bring this capability under its umbrella. The same hooks could also integrate Velero with other tooling (imagine a PreRestore Velero plugin that calls Cluster API to scale up the cluster before a restore).
cc: @dsu-igeek @eleanor-millman @codegold79
Anything else you would like to add:
I will put together a design proposal for these hooks, so please stay tuned. I am also volunteering to implement them if the design is approved.
Once these hooks are part of the Velero core, we can consider open-sourcing the Velero plugin that generically quiesces/unquiesces workloads (with a big disclaimer that specific pre/post pod hooks are preferable).
Vote on this issue!
This is an invitation to the Velero community to vote on issues. You can see the project's top voted issues listed here.
Use the "reaction smiley face" at the top right of this comment to vote.
- 👍 for "The project would be better with this feature added"
- 👎 for "This feature will not enhance the project in a meaningful way"