-
Notifications
You must be signed in to change notification settings - Fork 755
fix: remove existing /workspace from CUDA base image before COPY oper… #2382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: remove existing /workspace from CUDA base image before COPY oper… #2382
Conversation
…ation The CUDA base image maintained by the CUDA team includes a /workspace path. Remove this existing path and recreate as directory to avoid COPY conflicts.
WalkthroughThis change updates three Dockerfiles to explicitly remove any existing Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~7 minutes Possibly related PRs
Poem
Note 🔌 MCP (Model Context Protocol) integration is now available in Early Access!Pro users can now connect to remote MCP servers under the Integrations page to get reviews and chat conversations that understand additional development context. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
container/Dockerfile.sglang(1 hunks)container/Dockerfile.tensorrt_llm(1 hunks)container/Dockerfile.vllm(1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: nnshah1
PR: ai-dynamo/dynamo#1444
File: tests/fault_tolerance/scenarios.py:57-57
Timestamp: 2025-07-01T15:39:56.789Z
Learning: The fault tolerance tests in tests/fault_tolerance/ are designed to run only in the mounted container environment, so hardcoded absolute paths with `/workspace/` prefix are intentional and should not be changed to relative paths.
Learnt from: grahamking
PR: ai-dynamo/dynamo#1177
File: container/Dockerfile.vllm:102-105
Timestamp: 2025-05-28T22:54:46.875Z
Learning: In Dockerfiles, when appending to environment variables that may not exist in the base image, Docker validation will fail if you reference undefined variables with ${VARIABLE} syntax. In such cases, setting the environment variable directly (e.g., ENV CPATH=/usr/include) rather than appending is the appropriate approach.
📚 Learning: 2025-07-01T15:39:56.789Z
Learnt from: nnshah1
PR: ai-dynamo/dynamo#1444
File: tests/fault_tolerance/scenarios.py:57-57
Timestamp: 2025-07-01T15:39:56.789Z
Learning: The fault tolerance tests in tests/fault_tolerance/ are designed to run only in the mounted container environment, so hardcoded absolute paths with `/workspace/` prefix are intentional and should not be changed to relative paths.
Applied to files:
container/Dockerfile.vllmcontainer/Dockerfile.sglangcontainer/Dockerfile.tensorrt_llm
📚 Learning: 2025-05-28T22:54:46.875Z
Learnt from: grahamking
PR: ai-dynamo/dynamo#1177
File: container/Dockerfile.vllm:102-105
Timestamp: 2025-05-28T22:54:46.875Z
Learning: In Dockerfiles, when appending to environment variables that may not exist in the base image, Docker validation will fail if you reference undefined variables with ${VARIABLE} syntax. In such cases, setting the environment variable directly (e.g., ENV CPATH=/usr/include) rather than appending is the appropriate approach.
Applied to files:
container/Dockerfile.vllm
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build and Test - dynamo
| # This path exists from the base image, maintained by the CUDA team. Don't need it, so remove it. | ||
| RUN rm -f /workspace && mkdir -p /workspace | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Same issue: rm -f on a directory fails; and this runs after prior COPYs
- rm -f /workspace will fail if /workspace is already a directory in this stage (prior WORKDIR/COPY likely created it).
- If /workspace were a file/symlink from a CUDA image, this should be run before any WORKDIR/COPY using /workspace.
Make deletion conditional and always ensure the directory exists:
-# This path exists from the base image, maintained by the CUDA team. Don't need it, so remove it.
-RUN rm -f /workspace && mkdir -p /workspace
+# This path sometimes exists in CUDA images as a file/symlink. Ensure it's a real directory.
+RUN if [ -L /workspace ] || [ -f /workspace ]; then rm -f /workspace; fi && mkdir -p /workspaceAlso, move this RUN earlier in the stage (before the first WORKDIR/COPY to /workspace) to guarantee correctness across base images.
🏁 Script executed:
#!/bin/bash
rg -n --context 2 $'RUN .*rm -f /workspace' container/Dockerfile.sglangLength of output: 260
🏁 Script executed:
#!/bin/bash
rg -n --context 3 '/workspace' container/Dockerfile.sglangLength of output: 6467
Ensure /workspace is a real directory before any COPY/WORKDIR to avoid rm -f errors
The current RUN rm -f /workspace && mkdir -p /workspace (container/Dockerfile.sglang:364–366) will always error once /workspace exists as a directory (e.g., after WORKDIR or COPY). To fix:
• Move the removal step to immediately after FROM base AS ci_minimum (before any WORKDIR or COPY)
• Conditionally remove only a file or symlink, then recreate the directory
Suggested diff in container/Dockerfile.sglang:
FROM base AS ci_minimum
+# Ensure `/workspace` isn’t a stale file/symlink from the base image
+RUN if [ -L /workspace ] || [ -f /workspace ]; then rm -f /workspace; fi \
+ && mkdir -p /workspace
ENV DYNAMO_HOME=/workspace
ENV CARGO_TARGET_DIR=/workspace/target
WORKDIR /workspace
…
-# This path exists from the base image, maintained by the CUDA team. Don't need it, so remove it.
-RUN rm -f /workspace && mkdir -p /workspace
-# Copy rest of the code
+COPY --from=wheel_builder /workspace /workspace
+… (rest of your COPY/RUN steps)This guarantees /workspace is a directory (never a broken symlink or file) before you start copying in artifacts.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In container/Dockerfile.sglang around lines 364 to 366, the command `rm -f
/workspace && mkdir -p /workspace` causes errors if /workspace is already a
directory. To fix this, move the removal step to immediately after the `FROM
base AS ci_minimum` line before any `WORKDIR` or `COPY` commands. Change the
removal to only delete /workspace if it is a file or symlink, then ensure
/workspace is created as a directory. This prevents errors and guarantees
/workspace is a proper directory before further operations.
| # This path exists from the base image, maintained by the CUDA team. Don't need it, so remove it. | ||
| RUN rm -f /workspace && mkdir -p /workspace | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
rm -f will error on directories; position should be before any /workspace usage
- rm -f fails when /workspace is a directory (common in these stages after earlier COPY/WORKDIR).
- If /workspace is a file/symlink in the base image, removal should precede the first WORKDIR/COPY to /workspace in this stage.
Update to a safe conditional and ensure the directory exists:
-# This path exists from the base image, maintained by the CUDA team. Don't need it, so remove it.
-RUN rm -f /workspace && mkdir -p /workspace
+# This path sometimes exists in CUDA images as a file/symlink. Ensure it's a real directory.
+RUN if [ -L /workspace ] || [ -f /workspace ]; then rm -f /workspace; fi && mkdir -p /workspaceRecommend moving this block to just before any instruction in this stage that targets /workspace (before COPY/WORKDIR) to avoid ordering hazards.
🏁 Script executed:
#!/bin/bash
rg -n --context 2 $'RUN .*rm -f /workspace' container/Dockerfile.tensorrt_llmLength of output: 266
🏁 Script executed:
#!/bin/bash
# Show context around the problematic RUN at lines 320-323
sed -n '280,360p' container/Dockerfile.tensorrt_llmLength of output: 2568
Correct /workspace removal ordering and guard in dev stage of container/Dockerfile.tensorrt_llm
- The current
RUN rm -f /workspace && mkdir -p /workspaceis placed afterWORKDIRand aCOPY --from=wheel_builder /workspace, so:rm -f /workspacewill error if/workspaceis already a directory.- It also wipes out the wheels and build artifacts just copied from the builder stage.
- Move this block to before any
WORKDIR/COPYtargeting/workspace, and wrap the removal in a conditional that only deletes files or symlinks.
Apply:
FROM build AS dev
- WORKDIR /workspace
+ # Ensure /workspace is a real directory (not a symlink or file from the base image)
+ RUN if [ -L /workspace ] || [ -f /workspace ]; then rm -f /workspace; fi && mkdir -p /workspace
+ WORKDIR /workspace
- COPY --from=wheel_builder /workspace /workspace
- COPY --from=wheel_builder $CARGO_HOME $CARGO_HOME
+ # Copy built wheels and Cargo cache from the builder stage
+ COPY --from=wheel_builder /workspace /workspace
+ COPY --from=wheel_builder $CARGO_HOME $CARGO_HOME
- # This path exists from the base image, maintained by the CUDA team. Don't need it, so remove it.
- RUN rm -f /workspace && mkdir -p /workspace
- # Copy rest of the code
- COPY . /workspace
+ # Copy the rest of the source
+ COPY . /workspace📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # This path exists from the base image, maintained by the CUDA team. Don't need it, so remove it. | |
| RUN rm -f /workspace && mkdir -p /workspace | |
| FROM build AS dev | |
| # Ensure /workspace is a real directory (not a symlink or file from the base image) | |
| RUN if [ -L /workspace ] || [ -f /workspace ]; then rm -f /workspace; fi && mkdir -p /workspace | |
| WORKDIR /workspace | |
| # Copy built wheels and Cargo cache from the builder stage | |
| COPY --from=wheel_builder /workspace /workspace | |
| COPY --from=wheel_builder $CARGO_HOME $CARGO_HOME | |
| # Copy the rest of the source | |
| COPY . /workspace |
🤖 Prompt for AI Agents
In container/Dockerfile.tensorrt_llm around lines 321 to 323, the command `rm -f
/workspace && mkdir -p /workspace` is incorrectly placed after `WORKDIR` and
`COPY` commands that use `/workspace`, causing errors if `/workspace` is a
directory and deleting copied build artifacts. To fix this, move this removal
and directory creation block to before any `WORKDIR` or `COPY` commands
targeting `/workspace`. Also, change the removal command to conditionally delete
only if `/workspace` is a file or symlink, avoiding errors when it is a
directory.
| # This path exists from the base image, maintained by the CUDA team. Don't need it, so remove it. | ||
| RUN rm -f /workspace && mkdir -p /workspace | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
rm -f on /workspace will fail if it’s already a directory; also placed too late in the stage
- Using rm -f against a directory errors (rm: cannot remove '/workspace': Is a directory), breaking the build since /workspace is already made a directory earlier via COPY/WORKDIR in this stage.
- If it were a file/symlink, this deletion should happen before the first COPY/WORKDIR that uses /workspace, not after prior COPYs.
Fix the command to only delete when /workspace is a file or symlink, then ensure the path exists:
-# This path exists from the base image, maintained by the CUDA team. Don't need it, so remove it.
-RUN rm -f /workspace && mkdir -p /workspace
+# This path sometimes exists in CUDA images as a file/symlink. Ensure it's a real directory.
+RUN if [ -L /workspace ] || [ -f /workspace ]; then rm -f /workspace; fi && mkdir -p /workspaceAdditionally, consider moving this RUN to before the first instruction in this stage that references /workspace (ideally before WORKDIR/COPY to /workspace) so we guarantee correctness regardless of the base image state.
You can quickly check all placements with:
🏁 Script executed:
#!/bin/bash
rg -n --context 2 $'RUN .*rm -f /workspace' container/Dockerfile.vllmLength of output: 258
🏁 Script executed:
#!/bin/bash
# Examine the stage up to the removal command for references to /workspace
echo "=== WORKDIR instructions before line 392 ==="
sed -n '1,392p' container/Dockerfile.vllm | grep -n 'WORKDIR /workspace' || echo "None found"
echo ""
echo "=== COPY instructions targeting /workspace before line 392 ==="
sed -n '1,392p' container/Dockerfile.vllm | grep -n -E 'COPY .* /workspace' || echo "None found"Length of output: 886
🏁 Script executed:
#!/bin/bash
# List all stage start lines
echo "=== Stage boundaries ==="
grep -n '^FROM' container/Dockerfile.vllm
# Show context around removal to understand the stage
echo ""
echo "=== Context lines 360-430 ==="
sed -n '360,430p' container/Dockerfile.vllmLength of output: 2743
Use a recursive remove and relocate the cleanup before any COPY in the ci_minimum stage
- In
container/Dockerfile.vllm, changerm -ftorm -rfso directories and their contents are removed without error. - Move the
RUNthat wipes/workspaceto immediately afterWORKDIR /workspace—before anyCOPY --from=wheel_builder—to clear out the base image’s artifacts but preserve the wheels you pull in next.
Affected file:
- container/Dockerfile.vllm (ci_minimum stage)
Suggested diff:
FROM base AS ci_minimum
ENV DYNAMO_HOME=/workspace
ENV CARGO_TARGET_DIR=/workspace/target
WORKDIR /workspace
-# This path exists from the base image, maintained by the CUDA team. Don't need it, so remove it.
-RUN rm -f /workspace && mkdir -p /workspace
+# Clean out any stale mount from the CUDA base image
+RUN rm -rf /workspace && mkdir -p /workspace
# Pull in the built wheels and cache
COPY --from=wheel_builder /workspace /workspace
COPY --from=wheel_builder $NIXL_PREFIX $NIXL_PREFIX
COPY --from=wheel_builder $CARGO_HOME $CARGO_HOME
# Copy the rest of the application source
COPY . /workspace🤖 Prompt for AI Agents
In container/Dockerfile.vllm around lines 393 to 395 within the ci_minimum
stage, change the command from 'rm -f /workspace' to 'rm -rf /workspace' to
ensure directories and their contents are removed recursively without errors.
Additionally, move this RUN command to immediately after the 'WORKDIR
/workspace' instruction and before any 'COPY --from=wheel_builder' commands to
properly clean up the base image's /workspace directory while preserving the
wheels copied later.
Summary by CodeRabbit
/workspacedirectory initialization across multiple container configurations.