
feat: Add S3 artifact repository configuration options for parallel transfers and resource limits #14571

Draft
usulkies wants to merge 5 commits into argoproj:main from usulkies:uziel/s3_parallel
usulkies commented Jun 16, 2025

Fixes #12442, #9022, #4014

Motivation

The current S3 artifact repository implementation has performance limitations that cause slow upload/download times:

  1. Many small files: Sequential directory uploads/downloads are slow when dealing with hundreds or thousands of files
  2. Large single files: Single-threaded transfers for large files (e.g., 50GB) take excessive time
  3. No configurability: Users cannot tune parallelism settings for their specific use cases

This feature addresses these issues by adding configurable parallelism options that leverage MinIO's built-in multipart capabilities and implement worker pools for directory operations.

Modifications

1. Added Configuration Options to S3Bucket

Added five new optional fields to the S3Bucket type:

  • enableParallelism: Enable/disable parallel operations (default: true)
  • parallelism: Number of concurrent workers for parallel operations (default: 10)
  • fileCountThreshold: Minimum number of files in a directory to trigger parallel operations (default: 10)
  • fileSizeThreshold: Minimum file size to trigger multipart upload/download (default: "64Mi", supports Kubernetes resource quantity strings like "64Mi", "1Gi")
  • partSize: Part size for multipart uploads (default: the MinIO client default, typically 128MB; supports Kubernetes resource quantity strings)
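
As a rough illustration of how the defaults listed above could be resolved, here is a minimal Go sketch. The `s3ParallelConfig` and `resolved` types and the `resolve` and `parseBinaryQuantity` functions are hypothetical names for this example, not the PR's actual code, and the real field types may differ. The parser is a simplified stand-in that only understands plain byte counts and the Ki/Mi/Gi/Ti suffixes; an actual implementation would more likely rely on the Kubernetes `resource.ParseQuantity` function.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// s3ParallelConfig is a hypothetical mirror of the five new optional fields.
// Nil pointers and empty strings mean "not set".
type s3ParallelConfig struct {
	EnableParallelism  *bool
	Parallelism        *int32
	FileCountThreshold *int32
	FileSizeThreshold  string // e.g. "64Mi"
	PartSize           string // empty: leave the choice to the MinIO client
}

// resolved holds the effective settings after defaults are applied.
type resolved struct {
	Enabled            bool
	Workers            int
	FileCountThreshold int
	FileSizeThreshold  int64
	PartSize           uint64 // 0 means "use the MinIO default"
}

// parseBinaryQuantity is a simplified stand-in for a Kubernetes quantity parser.
// It accepts non-negative integers with an optional Ki, Mi, Gi or Ti suffix
// and does not guard against overflow.
func parseBinaryQuantity(s string) (int64, error) {
	mult := int64(1)
	digits := s
	for suffix, m := range map[string]int64{"Ki": 1 << 10, "Mi": 1 << 20, "Gi": 1 << 30, "Ti": 1 << 40} {
		if strings.HasSuffix(s, suffix) {
			mult, digits = m, strings.TrimSuffix(s, suffix)
			break
		}
	}
	n, err := strconv.ParseInt(digits, 10, 64)
	if err != nil || n < 0 {
		return 0, fmt.Errorf("invalid size %q", s)
	}
	return n * mult, nil
}

// resolve applies the defaults stated in the PR description
// (enabled, 10 workers, 10 files, 64Mi) to any unset field.
func resolve(c s3ParallelConfig) (resolved, error) {
	r := resolved{Enabled: true, Workers: 10, FileCountThreshold: 10, FileSizeThreshold: 64 << 20}
	if c.EnableParallelism != nil {
		r.Enabled = *c.EnableParallelism
	}
	if c.Parallelism != nil {
		r.Workers = int(*c.Parallelism)
	}
	if c.FileCountThreshold != nil {
		r.FileCountThreshold = int(*c.FileCountThreshold)
	}
	if c.FileSizeThreshold != "" {
		v, err := parseBinaryQuantity(c.FileSizeThreshold)
		if err != nil {
			return resolved{}, err
		}
		r.FileSizeThreshold = v
	}
	if c.PartSize != "" {
		v, err := parseBinaryQuantity(c.PartSize)
		if err != nil {
			return resolved{}, err
		}
		r.PartSize = uint64(v)
	}
	return r, nil
}

func main() {
	workers := int32(5)
	r, err := resolve(s3ParallelConfig{Parallelism: &workers, FileSizeThreshold: "10Mi", PartSize: "32Mi"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", r)
}
```

Using pointers for the boolean and integer fields is one way to tell "unset" apart from an explicit `false` or `0`, which matters when the default for enableParallelism is `true`.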

2. Implementation Details

Parallel Directory Operations

  • PutDirectory: When enableParallelism is true and file count exceeds fileCountThreshold, uses a worker pool to upload files concurrently
  • GetDirectory: When enableParallelism is true and file count exceeds fileCountThreshold, uses a worker pool to download files concurrently
  • Worker pool size is controlled by the parallelism setting
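
The worker-pool behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's `putDirectoryParallel()` or `getDirectoryParallel()`: the `transferFunc` type and the `transferAll` function are hypothetical, and the per-file operation is simulated rather than talking to S3.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// transferFunc is a hypothetical per-file operation (an upload or a download).
type transferFunc func(path string) error

// transferAll runs fn for every file. It falls back to a sequential loop unless
// parallelism is enabled and the file count exceeds the threshold; otherwise it
// starts `workers` goroutines that drain a shared job channel.
func transferAll(files []string, enabled bool, workers, threshold int, fn transferFunc) error {
	if !enabled || len(files) <= threshold || workers < 2 {
		for _, f := range files {
			if err := fn(f); err != nil {
				return err // sequential path stops at the first failure
			}
		}
		return nil
	}

	jobs := make(chan string)
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		errs []error
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for f := range jobs {
				if err := fn(f); err != nil {
					mu.Lock()
					errs = append(errs, fmt.Errorf("%s: %w", f, err))
					mu.Unlock()
				}
			}
		}()
	}
	for _, f := range files {
		jobs <- f
	}
	close(jobs)
	wg.Wait()
	// errors.Join returns nil when no worker reported an error.
	return errors.Join(errs...)
}

func main() {
	files := make([]string, 25)
	for i := range files {
		files[i] = fmt.Sprintf("results/test-%02d.xml", i)
	}
	var mu sync.Mutex
	done := 0
	err := transferAll(files, true, 10, 10, func(path string) error {
		mu.Lock()
		done++
		mu.Unlock()
		return nil
	})
	fmt.Println("transferred:", done, "error:", err)
}
```

In this sketch the parallel path keeps going after a failure and reports all errors together, while the sequential path stops at the first one. Whether the PR behaves the same way is not stated in the description.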

Large File Multipart Operations

  • PutFile: For files larger than fileSizeThreshold, configures MinIO's FPutObject with NumThreads and PartSize for parallel multipart uploads
  • GetFile: For files larger than fileSizeThreshold, leverages MinIO's automatic multipart download capabilities
  • Uses MinIO's built-in multipart upload/download with configurable thread count
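
To make the threshold rule concrete, the following Go sketch shows one way the upload options could be chosen. The `putOptions` struct is a local stand-in for the `NumThreads` and `PartSize` fields of minio-go's `PutObjectOptions` so that the example runs without the MinIO dependency; `optionsFor` and `partCount` are hypothetical helpers, not functions from the PR.

```go
package main

import "fmt"

// putOptions is a local stand-in for the two minio.PutObjectOptions fields
// named in the PR description. Zero values are assumed to leave the MinIO
// client's own defaults in place.
type putOptions struct {
	NumThreads uint
	PartSize   uint64
}

// optionsFor returns multipart settings only for files strictly larger than
// the threshold; smaller files get the zero value.
func optionsFor(fileSize, threshold int64, workers uint, partSize uint64) putOptions {
	if fileSize <= threshold {
		return putOptions{}
	}
	return putOptions{NumThreads: workers, PartSize: partSize}
}

// partCount is the number of parts a file splits into (ceiling division).
// S3 allows at most 10,000 parts per multipart upload, so very large files
// need a correspondingly larger part size.
func partCount(fileSize int64, partSize uint64) uint64 {
	if fileSize <= 0 || partSize == 0 {
		return 0
	}
	return (uint64(fileSize) + partSize - 1) / partSize
}

func main() {
	const mi, gi = 1 << 20, 1 << 30

	// A 50 GiB file with the example settings: 10 threads, 128 MiB parts.
	opts := optionsFor(50*gi, 64*mi, 10, 128*mi)
	fmt.Printf("large file: %+v, parts: %d\n", opts, partCount(50*gi, opts.PartSize))

	// A 1 MiB file stays below the 64 MiB threshold.
	fmt.Printf("small file: %+v\n", optionsFor(1*mi, 64*mi, 10, 128*mi))
}
```

With these example numbers a 50 GiB file splits into 400 parts of 128 MiB, well under the S3 part limit.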

3. Configuration Methods

The parallelism configuration can be set in two ways:

Via Artifact Repository (ConfigMap)

apiVersion: v1
kind: ConfigMap
metadata:
  name: artifact-repositories
data:
  default-v1: |
    archiveLogs: true
    s3:
      bucket: my-bucket
      endpoint: minio:9000
      insecure: true
      accessKeySecret:
        name: my-minio-cred
        key: accesskey
      secretKeySecret:
        name: my-minio-cred
        key: secretkey
      enableParallelism: true
      parallelism: 10
      fileCountThreshold: 10
      fileSizeThreshold: 64Mi
      partSize: 128Mi

Via Inline Artifact Properties

outputs:
  artifacts:
    - name: large-file
      path: /tmp/large-file
      s3:
        enableParallelism: true
        parallelism: 5
        fileSizeThreshold: 10Mi
        partSize: 32Mi

4. Code Changes

  • pkg/apis/workflow/v1alpha1/workflow_types.go: Added new fields to S3Bucket struct
  • workflow/artifacts/s3/s3.go:
    • Added parallelism fields to S3ClientOpts and s3client structs
    • Implemented putDirectoryParallel() and getDirectoryParallel() with worker pools
    • Enhanced PutFile() and GetFile() to configure MinIO multipart options for large files
  • workflow/artifacts/artifacts.go: Pass parallelism configuration from artifact repository to S3 client
  • Generated files: Updated protobuf, OpenAPI, CRDs, and documentation

Verification

Unit Tests

  • Added comprehensive tests in workflow/artifacts/s3/s3_parallel_test.go
  • Tested parallel directory upload/download with various file counts
  • Tested configuration parsing and defaults
  • All existing tests pass

Integration Testing

  • Successfully submitted 30+ example workflows from the examples/ directory
  • All workflows accepted by API with no validation errors
  • Verified backward compatibility (existing workflows continue to work)

Performance Testing

  • Tested with workflows containing many files (20+ files) - parallel upload works
  • Tested with large single files (100MB+) - multipart upload with parallelism works
  • Verified configuration is properly applied from both artifact repository and inline settings

Documentation

  • Updated docs/fields.md with new S3Bucket configuration options
  • Created feature documentation in .features/pending/s3-parallelism-config.md
  • Updated OpenAPI schema and JSON schema
  • Updated CRD manifests

Breaking Changes

None. All new fields are optional with sensible defaults, ensuring backward compatibility.

Example Use Cases

  1. CI/CD pipelines with many artifacts: Upload hundreds of test result files in parallel
  2. Large dataset transfers: Transfer 50GB+ files using multipart uploads with multiple threads
  3. High-throughput workflows: Configure higher parallelism for workflows that frequently transfer artifacts

usulkies marked this pull request as draft June 16, 2025 15:14
usulkies force-pushed the uziel/s3_parallel branch from 4c09dff to c52f206 June 16, 2025 15:49
usulkies marked this pull request as ready for review June 17, 2025 12:56
usulkies marked this pull request as draft June 17, 2025 14:30
Joibel left a comment

Just some initial thoughts on this PR. Thanks for working on it.

usulkies marked this pull request as ready for review June 19, 2025 07:32
usulkies marked this pull request as draft June 19, 2025 21:48
antoinetran commented

Hi! I would be very interested in this PR, because we see roughly a tenfold speed difference: saving a 35GiB artifact takes 2m with awscli and 23m with argo. The options here are great!
Is there any issue blocking the merge to main?

usulkies commented Dec 6, 2025

> Hi! I would be very interested in this PR, because we see roughly a tenfold speed difference: saving a 35GiB artifact takes 2m with awscli and 23m with argo. The options here are great! Is there any issue blocking the merge to main?

Well, yes...
This didn't work fully. Need to find some time to try it again.

…12442 argoproj#9022 argoproj#4014

Signed-off-by: Uziel Sulkies <10584010+usulkies@users.noreply.github.com>
argoproj#12442 argoproj#9022 argoproj#4014

Signed-off-by: Uziel Sulkies <10584010+usulkies@users.noreply.github.com>
…mentation. Fixes argoproj#12442 argoproj#9022 argoproj#4014

Signed-off-by: Uziel Sulkies <10584010+usulkies@users.noreply.github.com>
…ges in fileSizeThreshold and partSize types. Fixes argoproj#12442 argoproj#9022 argoproj#4014

Signed-off-by: Uziel Sulkies <10584010+usulkies@users.noreply.github.com>
…ds. Fixes argoproj#12442 argoproj#9022 argoproj#4014

Signed-off-by: Uziel Sulkies <10584010+usulkies@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

Implement parallelization to speed up S3 artifacts upload and download
