Conversation

@hhzhang16 (Contributor) commented Apr 16, 2025

Overview:

This PR updates the top-level Earthfile and adds container/Earthfile to support building a compact Dynamo vLLM Docker image, both for local use and for CI.

Details:

  • Added a new target dynamo-base-docker-llm that builds the image and verifies that both Dynamo and vLLM install correctly
  • Created container/Earthfile containing a vllm-build target that builds a patched version of the vllm package; it only needs to rebuild when the containers/deps/vllm subdirectory changes
  • Reduced the vLLM image size from 34GB (dynamo:latest-vllm-local-dev) to 13GB with the Earthly build

A simple docker run demonstrating NIXL and GPU support:

> docker run --gpus all --rm -it my-registry/dynamo-base-docker-llm /bin/bash
root@36daedf25032:/workspace# python3
Python 3.12.3 (main, Feb  4 2025, 14:48:35) [GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import vllm
INFO 04-17 05:21:44 __init__.py:190] Automatically detected platform cuda.
WARNING 04-17 05:21:44 cuda.py:336] Detected different devices in the system: 
WARNING 04-17 05:21:44 cuda.py:336] NVIDIA GeForce GT 1030
WARNING 04-17 05:21:44 cuda.py:336] NVIDIA RTX A6000
WARNING 04-17 05:21:44 cuda.py:336] NVIDIA RTX A6000
WARNING 04-17 05:21:44 cuda.py:336] Please make sure to set `CUDA_DEVICE_ORDER=PCI_BUS_ID` to avoid unexpected behavior.
INFO 04-17 05:21:44 nixl.py:16] NIXL is available
>>> 
  • Confirmed that the 13GB image works as a Dynamo base image: the resulting deployment succeeds, and the image builder on K8s can handle it (the 34GB image crashes the image builder)
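The build-time verification described above (importing dynamo and vllm inside the image) generalizes to a small smoke-test helper. This is an illustrative sketch, not code from the PR; the module names passed in would be whatever the image is expected to provide:

```python
import importlib.util


def verify_imports(modules):
    """Return the subset of module names that cannot be found.

    Uses find_spec rather than importing, so a broken module body
    won't crash the check -- it only confirms the package is present.
    """
    return [m for m in modules if importlib.util.find_spec(m) is None]


if __name__ == "__main__":
    # In the image this might be: verify_imports(["dynamo", "vllm"])
    missing = verify_imports(["sys", "json"])
    if missing:
        raise SystemExit(f"Missing modules: {missing}")
```

A check like this could back the image's RUN verification step, failing the build with a clear message instead of a bare ImportError.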

Where should the reviewer start?

Run earthly +dynamo-base-docker-llm to test locally

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

Summary by CodeRabbit

  • New Features

    • Introduced a new build process for CUDA-enabled environments, including a reusable setup for CUDA toolkit and NVIDIA drivers.
    • Added a new build target for creating a Docker image with minimal NIXL dependencies and pre-installed vllm Python wheel.
    • Added a new multi-stage container build process supporting vllm, uv, and nixl components.
  • Improvements

    • Streamlined Docker image build and push steps with simplified commands and updated environment variable usage.
    • Enhanced build orchestration by reorganizing and parameterizing build targets for greater flexibility and maintainability.
  • Documentation

    • Updated build and push instructions to reflect new Earthly-based workflow and naming conventions.

@copy-pr-bot bot commented Apr 16, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.


@hhzhang16 hhzhang16 marked this pull request as ready for review April 18, 2025 22:26
@github-actions bot

This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label Jun 21, 2025
@coderabbitai coderabbitai bot commented Jun 21, 2025

Walkthrough

A new multi-stage Earthfile was added for building vLLM, uv, and NIXL components in a CUDA-enabled environment. The main Earthfile was refactored to introduce reusable CUDA setup, update Docker image argument conventions, add a dedicated LLM base image build, and reorganize build orchestration targets. The README was updated to reflect the new Earthly-based image build and push workflow.

Changes

| File(s) | Change Summary |
| --- | --- |
| Earthfile | Added SETUP_CUDA function; refactored CUDA setup; renamed Docker args; added dynamo-base-docker-llm and orchestration targets; updated image build flow. |
| README.md | Updated instructions to use Earthly with new Docker argument conventions and image naming. |
| container/Earthfile | New file: defines multi-stage builds for vllm-build, uv-source, nixl-source, and nixl-base with artifact management and CUDA support. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Earthfile
    participant Docker
    participant ArtifactStore

    User->>Earthfile: Run +dynamo-base-docker-llm (with DOCKER_SERVER, IMAGE_TAG)
    Earthfile->>Docker: Build image with CUDA, NIXL, vLLM, UCX
    Earthfile->>ArtifactStore: Copy NIXL/vLLM/uv artifacts from container/Earthfile
    Earthfile->>Docker: Install dependencies, set env vars, verify imports
    Earthfile->>Docker: Push built image to DOCKER_SERVER with IMAGE_TAG

Suggested labels

size/M, feat, build

Suggested reviewers

  • tanmayv25
  • nnshah1
  • alec-flowers
  • ishandhanani

Poem

In the warren where builds are spun,
Earthfiles now dance, their work begun.
CUDA and NIXL, vLLM too—
All in one image, shiny and new!
Docker tags hop, arguments leap,
This bunny’s code is tidy and neat.
🐇✨



@github-actions github-actions bot added the feat label Jun 21, 2025
@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (2)
container/Earthfile (2)

60-61: Consider using a more modern manylinux tag.

The manylinux1 tag is deprecated. Consider using manylinux2014 or manylinux_2_17 for better library support while maintaining compatibility.

-        sed -i 's/Tag: cp38-abi3-linux_x86_64/Tag: cp38-abi3-manylinux1_x86_64/g' ${VLLM_PATCHED_PACKAGE_NAME}-${VLLM_PATCHED_PACKAGE_VERSION}.dist-info/WHEEL && \
-        sed -i "s/-cp38-abi3-linux_x86_64.whl/-cp38-abi3-manylinux1_x86_64.whl/g" ${VLLM_PATCHED_PACKAGE_NAME}-${VLLM_PATCHED_PACKAGE_VERSION}.dist-info/RECORD && \
+        sed -i 's/Tag: cp38-abi3-linux_x86_64/Tag: cp38-abi3-manylinux2014_x86_64/g' ${VLLM_PATCHED_PACKAGE_NAME}-${VLLM_PATCHED_PACKAGE_VERSION}.dist-info/WHEEL && \
+        sed -i "s/-cp38-abi3-linux_x86_64.whl/-cp38-abi3-manylinux2014_x86_64.whl/g" ${VLLM_PATCHED_PACKAGE_NAME}-${VLLM_PATCHED_PACKAGE_VERSION}.dist-info/RECORD && \
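For context on the sed edits above: the platform tag being rewritten also appears in the wheel's filename, which per PEP 427 has the form name-version-python_tag-abi_tag-platform_tag.whl, and the filename, WHEEL metadata, and RECORD entries must all agree. A minimal parser for the five-field form, as an illustrative sketch (the example filename and version below are hypothetical):

```python
def wheel_tags(filename: str):
    """Split a PEP 427 wheel filename into (python, abi, platform) tags.

    Assumes the simple five-field form without the optional build tag;
    dashes never appear inside the name or version fields because PEP 427
    escapes them to underscores.
    """
    stem = filename[: -len(".whl")]
    _name, _version, py, abi, plat = stem.split("-")
    return py, abi, plat
```

As an aside, the wheel project's `wheel tags` subcommand (available in recent wheel releases) can retag a wheel and update its metadata consistently, which may be less fragile than hand-editing WHEEL and RECORD with sed.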

83-83: Consider providing a default value for NIXL_COMMIT.

The NIXL_COMMIT argument is required but has no default value, which could cause build failures if not provided.

-    ARG NIXL_COMMIT
+    ARG NIXL_COMMIT=main  # or a specific commit hash

Alternatively, add a build-time check to ensure it's provided:

     ARG NIXL_COMMIT
+    RUN test -n "${NIXL_COMMIT}" || (echo "Error: NIXL_COMMIT must be provided" && exit 1)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 831bc72 and 5b0e15f.

📒 Files selected for processing (3)
  • Earthfile (4 hunks)
  • README.md (1 hunks)
  • container/Earthfile (1 hunks)
🔇 Additional comments (8)
container/Earthfile (2)

67-70: LGTM!

Clean and efficient way to extract the uv binary from the official image.


144-144: The previous ripgrep run didn’t search any files—let’s locate the Earthfile and inspect both the base image and the CUDA flag:

#!/bin/bash
# Locate the Earthfile
earthfile=$(find . -type f -name Earthfile | head -n1)
echo "Using Earthfile: $earthfile"

# Show the Docker base image line
grep -n "FROM nvcr.io/nvidia/cuda-dl-base" "$earthfile" || echo "Base image line not found"

# Show UCX build flags for CUDA
grep -n "with-cuda" "$earthfile" || echo "No 'with-cuda' flags found"

# Show any references to /usr/local/cuda
grep -n "/usr/local/cuda" "$earthfile" || echo "No '/usr/local/cuda' references found"
README.md (2)

60-64: LGTM!

The updated build instructions using Earthly are clear and align with the new build process.


68-68: LGTM!

Correctly references the new LLM-optimized image with the -llm suffix.

Earthfile (4)

61-85: LGTM!

Excellent implementation of a reusable CUDA setup function with comprehensive environment variables and proper error handling.


100-100: LGTM!

Good refactoring to use the reusable SETUP_CUDA function.


150-151: LGTM!

The new argument names are more descriptive and flexible than the CI-specific ones.


224-241: LGTM!

Excellent reorganization of build targets into logical groups. The separation between cloud components and base images improves maintainability and allows for more granular builds.


RUN mkdir -p /workspace/dist
RUN mkdir /tmp/vllm && \
    pip download --only-binary=:all: --no-deps --dest /tmp/vllm vllm==v${VLLM_REF} && \

⚠️ Potential issue

Fix the version specifier in the pip download command (container/Earthfile, line 50).

The PyPI version format doesn't include the 'v' prefix. This will cause the download to fail.

Apply this diff to fix the version specifier:

-        pip download --only-binary=:all: --no-deps --dest /tmp/vllm vllm==v${VLLM_REF} && \
+        pip download --only-binary=:all: --no-deps --dest /tmp/vllm vllm==${VLLM_REF} && \
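If the build needs to accept both tag-style refs (v0.7.2) and plain versions (0.7.2), a small normalization helper avoids this class of mismatch. An illustrative sketch, not code from the PR:

```python
def pypi_version(ref: str) -> str:
    """Map a git-style release ref like 'v0.7.2' to the PyPI version '0.7.2'.

    Release tags often carry a leading 'v' that PyPI project versions
    normally lack; only strip it when a digit follows, so names like
    'vllm' pass through untouched.
    """
    if ref.startswith("v") and ref[1:2].isdigit():
        return ref[1:]
    return ref
```

A wrapper script could then run `pip download ... vllm==$(pypi_version "$VLLM_REF")` regardless of how the ref was supplied.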

rm -rf /workspace/ai_dynamo_vllm-*.whl

# Verify both Dynamo and vllm are properly installed
RUN python3 -c "import dynamo; import vllm" || (echo "Failed to import Dynamo or vllm" && exit 1)

🛠️ Refactor suggestion

Use the virtual environment's Python for import verification (Earthfile, line 216).

The verification command uses python3 which might not be the virtual environment Python. Use the activated environment's Python instead.

-    RUN python3 -c "import dynamo; import vllm" || (echo "Failed to import Dynamo or vllm" && exit 1)
+    RUN . /opt/dynamo/venv/bin/activate && python -c "import dynamo; import vllm" || (echo "Failed to import Dynamo or vllm" && exit 1)
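A related runtime check is confirming that the current interpreter actually belongs to the intended venv, rather than assuming activation worked. An illustrative Python sketch (the venv path is whatever the image uses; nothing here is from the PR):

```python
import sys
from pathlib import Path


def running_inside(venv_dir: str) -> bool:
    """True if the current interpreter belongs to the given venv.

    A venv is active when sys.prefix differs from sys.base_prefix;
    we additionally require that sys.prefix points at the expected
    directory, so activating the *wrong* venv is also caught.
    """
    if sys.prefix == sys.base_prefix:
        return False  # no venv active at all
    return Path(sys.prefix) == Path(venv_dir).resolve()
```

A verification step could assert `running_inside("/opt/dynamo/venv")` before importing, turning a silent wrong-interpreter problem into an explicit build failure.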

@github-actions github-actions bot removed the Stale label Jul 4, 2025
@grahamking (Contributor) commented

Earthly is going away: #2154

@grahamking grahamking closed this Aug 13, 2025