-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Specify timeout for plugins of specific Kind. #3018
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
phuongatemc
wants to merge
1
commit into
vmware-tanzu:main
Choose a base branch
from
phuongatemc:plugins-timeout
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file was deleted.
Oops, something went wrong.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,34 @@ | ||
| ## Timeout for Plugins | ||
| This document proposes a solution that allows user to specify timeout during backup for all plugin executions on an object of specific resource types. | ||
|
|
||
| ## Background | ||
| The execution of plugins in either Backup or Restore datapath is blocking call. If the plugin code does not behave as expected and return within certain period of time, the application may fail. For example in App Consistent backup, the application pod will be quiesced while plugin being executed so user operations are blocked until the plugin execution completes and pod is unquiesced. If for some reason the plugin execution takes longer than certain threshold or hang, the application starts to fail user operations. | ||
|
|
||
| ## Goals | ||
| - Enable user to specify timeout during backup for all plugin executions on an object of specific resource types. | ||
|
|
||
| ## Non-goals | ||
| - Restore data path may require similar timeout but it is beyond the scope of this change due to the complexity of dependency between restoring objects. | ||
|
|
||
| ## Alternatives Considered | ||
| - Set timeout for the entire Velero Backup or Restore to avoid the hanging of the backup or restore workflow. This only helps in some scenario but it will not help the AppConsistent pod being quiesced for too long. | ||
| - Set timeout for each plugin being executed on an object. In cases that multiple plugins being executed on an object, the total time of these plugin executions may be still greater than the limit even though individual timeout of each plugin would be smaller than timeout. | ||
|
|
||
|
|
||
| ## High-Level Design | ||
| - Enhance the BackupSpec to contain the map of resource type (Kind) to the timeout of executing all plugins being executed on an object of that resource type. | ||
| type BackupSpec struct { | ||
| ... | ||
| TotalPluginsTimeouts map[string]int `json:"totalPluginsTimeouts,omitempty"` //map of resource type to total plugins timeout (in milliseconds) | ||
| } | ||
| - During backup an item, if the item's type match with the type specify in the TotalPluginsTimeouts map, all the plugins will have to be executed within that timeout. If item's type is not in the map, then execute the plugins without timeout. | ||
|
|
||
| ### Changes to item_backupper.go | ||
| - In itemBackupper.backupItem, use the resource type of the item to look up the TotalPluginsTimeouts. If the timeout exists, create a goroutine to run executeActions to make sure that function would be done within the specified timeout or cancel it. If timeout does not exist, the run executeAction directly as before. | ||
|
|
||
| ### Changes to velero CLI | ||
| Add new flag "--total-plugins-timeouts" to Velero backup create command which takes a string of key-values pairs which represents the map between resource type and the total plugins timeout of such resource type. The key-value pairs are separated by semicolon. | ||
|
|
||
| Example: | ||
| >velero backup create mybackup --total-plugins-timeouts "persistentvolumeclaims=30000,pods=60000" | ||
|
|
||
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you give an example of what you envision the keys within this map to look like? Is it matching on a Kubernetes
Kind?If so, is that really only meaningful on BackupItemAction plugins, and not on any of the other plugin types, like ObjectStorage or VolumeSnapshotter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The key in this map is the Kubernetes Kind.
In theory, I think I can extend this design for all plugins, not just BackupItemAction. Not sure if I can have time to implement for all.