Skip to content

Conversation

@kaovilai
Copy link
Collaborator

Please add a summary of your change

This PR fixes issue #8870 where Velero was unnecessarily adding the restore-wait init container when restoring pods with volumes that were backed up using native datamover or CSI.

When restoring pods with volumes, Velero was always adding the restore-wait init container, even when the volumes were backed up using native datamover or CSI and didn't need file system restores. This was causing unnecessary overhead and potential issues.

Solution

Modified the pod volume restore action to check if any volumes actually need file system restores before adding the init container:

  1. Added a check in pkg/restore/actions/pod_volume_restore_action.go to determine if any volumes need file system restores by checking if there are matching PodVolumeBackups for the pod's volumes.
  2. Only add the restore-wait init container if there are volumes that need file system restores.
  3. Updated tests to match the new logic and renamed the test function to be more descriptive.

Testing

  • Added test cases to verify that the init container is only added when needed
  • Updated existing test cases to work with the new logic
  • All tests are passing

Signed-off-by: Tiger Kaovilai [email protected]

Thank you for contributing to Velero!

Does your change fix a particular issue?

Fixes #8870

Please indicate you've done the following:

This comment was marked as outdated.

@codecov
Copy link

codecov bot commented Apr 25, 2025

Codecov Report

Attention: Patch coverage is 85.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 60.09%. Comparing base (805237a) to head (22372b3).
Report is 4 commits behind head on main.

Files with missing lines Patch % Lines
pkg/restore/actions/pod_volume_restore_action.go 85.00% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8880      +/-   ##
==========================================
+ Coverage   60.07%   60.09%   +0.01%     
==========================================
  Files         378      378              
  Lines       42900    42911      +11     
==========================================
+ Hits        25771    25786      +15     
+ Misses      15583    15580       -3     
+ Partials     1546     1545       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

@blackpiglet blackpiglet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to make some changes.
To me, the current implementation cannot fix the issue.
Please see my comments.

@kaovilai kaovilai changed the title Modified the pod volume restore action to check if any volumes actually need file system restores before adding the init container: Remove restore-wait initcontainer if not needed on restore Apr 28, 2025
@kaovilai
Copy link
Collaborator Author

test passing, rebasing.

blackpiglet
blackpiglet previously approved these changes Jun 30, 2025
@blackpiglet blackpiglet dismissed their stale review June 30, 2025 08:06

Change looks good to me, but we still need to fix some linter issue and DCO issue.

@blackpiglet
Copy link
Contributor

@kaovilai
Hi Tiger, the PR change looks good to me, but we still need to fix some linter issues found after rebasing. Please replace assert.NoError with require.NoError.
Please also fix the DCO check error.
It's better to squash the commits, too.

@kaovilai
Copy link
Collaborator Author

Will do

for i := range podVolumeBackupList.Items {
podVolumeBackups = append(podVolumeBackups, &podVolumeBackupList.Items[i])
}
volumeSnapshots := podvolume.GetVolumeBackupsForPod(podVolumeBackups, &pod, podFromBackup.Namespace)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would propose another solution:

  1. At the beginning of this action, remove the restore-wait init-container blindly without checking needsFileSystemRestore
  2. The let it go with the existing logics, if fs-backup is involved, the existing logic will add the restore-wait init-container back

This way looks more simpler.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Incorporated.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR ensures the restore-wait init container is only added when needed and removed if no file system restores are required.

  • Filters out existing restore-wait init containers before deciding to add
  • Adds the init container only when matching PodVolumeBackups exist
  • Updates and enhances tests to cover both adding and removing scenarios

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
pkg/restore/actions/pod_volume_restore_action.go Remove existing wait containers, check for volume backups, prepend new init container only when needed
pkg/restore/actions/pod_volume_restore_action_test.go Add new test suite for file-system restore cases and update existing tests
changelogs/unreleased/8880-kaovilai Changelog entry for the fix
Comments suppressed due to low confidence (2)

pkg/restore/actions/pod_volume_restore_action_test.go:42

  • The velerotest import is not used in this file; consider removing it to clean up unused dependencies.
	velerotest "github.com/vmware-tanzu/velero/pkg/test"

pkg/restore/actions/pod_volume_restore_action_test.go:43

  • The kube import is unused in this test; remove it to avoid unused imports.
	"github.com/vmware-tanzu/velero/pkg/util/kube"

@reasonerjt
Copy link
Contributor

@kaovilai Would you squash this PR?

@kaovilai
Copy link
Collaborator Author

Sure. You can also squash during merge anyway?

This PR fixes issue vmware-tanzu#8870 where Velero was unnecessarily adding the restore-wait init container when restoring pods with volumes that were backed up using native datamover or CSI.

When restoring pods with volumes, Velero was always adding the restore-wait init container, even when the volumes were backed up using native datamover or CSI and didn't need file system restores. This was causing unnecessary overhead and potential issues.

PVR action to remove restore-wait init container on restore

1. Added a check in `pkg/restore/actions/pod_volume_restore_action.go` to determine if any volumes need file system restores by checking if there are matching PodVolumeBackups for the pod's volumes.
2. Only add the restore-wait init container if there are volumes that need file system restores.
3. Updated tests to match the new logic and renamed the test function to be more descriptive.

- Added test cases to verify that the init container is only added when needed
- Updated existing test cases to work with the new logic
- All tests are passing

Signed-off-by: Tiger Kaovilai <[email protected]>

Refactor init container assertions in pod volume restore tests for clarity

Signed-off-by: Tiger Kaovilai <[email protected]>

Fix restore-wait init container removal to handle multiple containers

Address PR feedback by implementing comprehensive removal of ALL restore-wait
init containers (both current and legacy names) to prevent duplicates.

Previously, the code only removed the first init container if it matched the
restore-wait names. This could leave behind multiple restore-wait containers
or fail to remove them if they weren't at position 0.

Changes:
- Remove ALL existing restore-wait init containers before deciding whether to add a new one
- This covers both scenarios: when no file system restore is needed AND when preventing duplicates
- Simplify the add logic since we've already cleaned up existing containers
- Add better logging to show how many containers were removed

Signed-off-by: Tiger Kaovilai <[email protected]>

Address PR feedback: clarify comments and fix test logic

- Clarify comments explaining when restore-wait init container is needed
 - Fix test expectations for matching PodVolumeBackups (should add init container)
 - Add backup labels to PodVolumeBackups in tests for proper filtering
 - Remove incorrect test assertion that checked init containers should not be restore-wai

Signed-off-by: Tiger Kaovilai <[email protected]>
@reasonerjt reasonerjt removed their request for review July 18, 2025 05:52
@blackpiglet blackpiglet merged commit 2b787f5 into vmware-tanzu:main Jul 21, 2025
46 checks passed
kaovilai added a commit to kaovilai/velero that referenced this pull request Jul 23, 2025
…anzu#8880)

This PR fixes issue vmware-tanzu#8870 where Velero was unnecessarily adding the restore-wait init container when restoring pods with volumes that were backed up using native datamover or CSI.

When restoring pods with volumes, Velero was always adding the restore-wait init container, even when the volumes were backed up using native datamover or CSI and didn't need file system restores. This was causing unnecessary overhead and potential issues.

PVR action to remove restore-wait init container on restore

Changes:
- Remove ALL existing restore-wait init containers before deciding whether to add a new one
- This covers both scenarios: when no file system restore is needed AND when preventing duplicates
- Simplify the add logic since we've already cleaned up existing containers
- Add better logging to show how many containers were removed

Signed-off-by: Tiger Kaovilai <[email protected]>
@MaloLelandais MaloLelandais mentioned this pull request Jul 28, 2025
3 tasks
MaloLelandais pushed a commit to MaloLelandais/velero that referenced this pull request Jul 28, 2025
…anzu#8880)

This PR fixes issue vmware-tanzu#8870 where Velero was unnecessarily adding the restore-wait init container when restoring pods with volumes that were backed up using native datamover or CSI.

When restoring pods with volumes, Velero was always adding the restore-wait init container, even when the volumes were backed up using native datamover or CSI and didn't need file system restores. This was causing unnecessary overhead and potential issues.

PVR action to remove restore-wait init container on restore

Changes:
- Remove ALL existing restore-wait init containers before deciding whether to add a new one
- This covers both scenarios: when no file system restore is needed AND when preventing duplicates
- Simplify the add logic since we've already cleaned up existing containers
- Add better logging to show how many containers were removed

Signed-off-by: Tiger Kaovilai <[email protected]>
@pawcykca
Copy link

When do you plan to release this fix?

@kaovilai
Copy link
Collaborator Author

See milestone assigned to #8870

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Velero shouldn't restore the restore-wait init container

5 participants