feat: Add S3 artifact repository configuration options for parallel transfers and resource limits #14571
Draft
usulkies wants to merge 5 commits into argoproj:main
4c09dff to c52f206
Joibel (Member) requested changes on Jun 18, 2025 and left a comment:
Just some initial thoughts on this PR. Thanks for working on it.
Contributor
Hi! We would be very interested in this PR, because we see a factor-of-10 speed difference: saving a 35GiB artifact takes 2m with awscli and 23m with argo. The options here are great!
Contributor, Author
Well, yes...
2ba73d9 to 05beb6f
…12442 argoproj#9022 argoproj#4014 Signed-off-by: Uziel Sulkies <10584010+usulkies@users.noreply.github.com>
argoproj#12442 argoproj#9022 argoproj#4014 Signed-off-by: Uziel Sulkies <10584010+usulkies@users.noreply.github.com>
…mentation. Fixes argoproj#12442 argoproj#9022 argoproj#4014 Signed-off-by: Uziel Sulkies <10584010+usulkies@users.noreply.github.com>
…ges in fileSizeThreshold and partSize types. Fixes argoproj#12442 argoproj#9022 argoproj#4014 Signed-off-by: Uziel Sulkies <10584010+usulkies@users.noreply.github.com>
…ds. Fixes argoproj#12442 argoproj#9022 argoproj#4014 Signed-off-by: Uziel Sulkies <10584010+usulkies@users.noreply.github.com>
4a563f5 to 1b8faf2
feat: Add S3 artifact repository configuration options for parallel transfers and resource limits
Fixes #12442, #9022, #4014
Motivation
The current S3 artifact repository implementation has performance limitations that cause slow upload/download times.
This feature addresses these issues by adding configurable parallelism options that leverage MinIO's built-in multipart capabilities and implement worker pools for directory operations.
Modifications
1. Added Configuration Options to S3Bucket
Added five new optional fields to the S3Bucket type:
- `enableParallelism`: Enable/disable parallel operations (default: `true`)
- `parallelism`: Number of concurrent workers for parallel operations (default: `10`)
- `fileCountThreshold`: Minimum number of files in a directory to trigger parallel operations (default: `10`)
- `fileSizeThreshold`: Minimum file size to trigger multipart upload/download (default: `"64Mi"`; supports Kubernetes resource quantity strings like `"64Mi"`, `"1Gi"`)
- `partSize`: Part size for multipart uploads (default: MinIO default, typically `128MB`; supports Kubernetes resource quantity strings)
2. Implementation Details
Parallel Directory Operations
- `PutDirectory`: When `enableParallelism` is true and file count exceeds `fileCountThreshold`, uses a worker pool to upload files concurrently
- `GetDirectory`: When `enableParallelism` is true and file count exceeds `fileCountThreshold`, uses a worker pool to download files concurrently
- `parallelism` setting
Large File Multipart Operations
- `PutFile`: For files larger than `fileSizeThreshold`, configures MinIO's `FPutObject` with `NumThreads` and `PartSize` for parallel multipart uploads
- `GetFile`: For files larger than `fileSizeThreshold`, leverages MinIO's automatic multipart download capabilities
3. Configuration Methods
The parallelism configuration can be set in two ways:
Via Artifact Repository (ConfigMap)
Via Inline Artifact Properties
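Whichever of the two routes supplies the options, unset fields fall back to the defaults listed under Modifications. The following is only a minimal sketch of that defaulting and of the two threshold checks described under Implementation Details. `ParallelConfig`, `Resolved`, `Resolve`, `UseParallelDirectory` and `UseMultipart` are hypothetical names, not the PR's actual types; sizes are modelled as plain byte counts rather than Kubernetes quantity strings, and `partSize` is left out because its default is whatever MinIO chooses.

```go
package main

import "fmt"

// ParallelConfig is a hypothetical stand-in for the new optional S3Bucket fields.
// A nil pointer means "not set by the user".
type ParallelConfig struct {
	EnableParallelism  *bool
	Parallelism        *int
	FileCountThreshold *int
	FileSizeThreshold  *int64 // bytes; the PR itself accepts strings such as "64Mi"
}

// Resolved holds the effective values once defaults have been applied.
type Resolved struct {
	EnableParallelism  bool
	Parallelism        int
	FileCountThreshold int
	FileSizeThreshold  int64
}

// Resolve fills every unset field with the default stated in the PR description.
func Resolve(c ParallelConfig) Resolved {
	r := Resolved{
		EnableParallelism:  true,
		Parallelism:        10,
		FileCountThreshold: 10,
		FileSizeThreshold:  64 << 20, // 64Mi
	}
	if c.EnableParallelism != nil {
		r.EnableParallelism = *c.EnableParallelism
	}
	if c.Parallelism != nil {
		r.Parallelism = *c.Parallelism
	}
	if c.FileCountThreshold != nil {
		r.FileCountThreshold = *c.FileCountThreshold
	}
	if c.FileSizeThreshold != nil {
		r.FileSizeThreshold = *c.FileSizeThreshold
	}
	return r
}

// UseParallelDirectory follows the "file count exceeds fileCountThreshold" wording.
func (r Resolved) UseParallelDirectory(fileCount int) bool {
	return r.EnableParallelism && fileCount > r.FileCountThreshold
}

// UseMultipart follows the "files larger than fileSizeThreshold" wording.
// Assumption: this check does not depend on EnableParallelism.
func (r Resolved) UseMultipart(sizeBytes int64) bool {
	return sizeBytes > r.FileSizeThreshold
}

func main() {
	off := false
	fmt.Println(Resolve(ParallelConfig{}).UseParallelDirectory(11))                          // true
	fmt.Println(Resolve(ParallelConfig{EnableParallelism: &off}).UseParallelDirectory(1000)) // false
}
```

Using pointers for the optional fields is one way to tell "unset" apart from an explicit `false` or `0`, which matters for `enableParallelism` because its default is `true`.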
4. Code Changes
- `pkg/apis/workflow/v1alpha1/workflow_types.go`: Added new fields to `S3Bucket` struct
- `workflow/artifacts/s3/s3.go`:
  - `S3ClientOpts` and `s3client` structs
  - `putDirectoryParallel()` and `getDirectoryParallel()` with worker pools
  - `PutFile()` and `GetFile()` to configure MinIO multipart options for large files
- `workflow/artifacts/artifacts.go`: Pass parallelism configuration from artifact repository to S3 client
Verification
Unit Tests
- `workflow/artifacts/s3/s3_parallel_test.go`
Integration Testing
- `examples/directory`
Performance Testing
Documentation
- `docs/fields.md` with new S3Bucket configuration options
- `.features/pending/s3-parallelism-config.md`
Breaking Changes
None. All new fields are optional with sensible defaults, ensuring backward compatibility.
Example Use Cases