rocm: reduce image size by using a multi-stage build #2246
Conversation
Reviewer's Guide (Sourcery)
Converts the ROCm container image to a multi-stage build: the llama/whisper binaries are staged under /tmp/install in a builder stage and copied into a slim runtime image. The build script is adjusted to use a temporary install prefix for ROCm and to use GPU_TARGETS instead of AMDGPU_TARGETS, and some redundant install logic for the whisper and llama binaries is removed.
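For orientation, here is a minimal sketch of the multi-stage layout described above, pieced together from the diff excerpts quoted in the review comments below. Both base images, the stage wiring, and the script argument are assumptions for illustration, not the actual Containerfile; the binary list is truncated to what the excerpts show.

```dockerfile
# Stage 1: build llama.cpp/whisper.cpp against the full ROCm toolchain.
# (This base image is hypothetical; the real Containerfile names its own.)
FROM quay.io/ramalama/rocm-builder AS builder
COPY container-images/scripts/build_llama_and_whisper.sh /tmp/
# The script stages its build artifacts under /tmp/install for ROCm builds.
RUN /tmp/build_llama_and_whisper.sh rocm

# Stage 2: slim runtime image; copy only what is needed at run time.
# (Hypothetical runtime base.)
FROM registry.access.redhat.com/ubi9/ubi
# Further binaries are elided in the review excerpt.
COPY --from=builder /tmp/install/bin/llama-quantize \
    /tmp/install/bin/whisper-server \
    /usr/bin/
COPY --from=builder /tmp/install/lib64/ /usr/lib64/
# Install only the ROCm runtime libraries, skipping weak dependencies.
RUN dnf -y --setopt=install_weak_deps=false install \
        hipblas rocblas rocm-hip rocm-runtime rocsolver && \
    dnf clean all
```

Everything heavy (compilers, ROCm development packages, build trees) stays in the builder stage, which is discarded from the final image; that is where the size reduction comes from.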
Summary of Changes (Gemini Code Assist)
Hello @mikebonnet, I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request introduces a multi-stage build strategy for the ROCm container image. The core purpose is to dramatically decrease the final image's footprint by isolating the build artifacts and transferring only the essential components to the production-ready image. This optimization keeps the container lean while still providing full functionality for ROCm-accelerated applications.
Hey there - I've reviewed your changes and found some issues that need to be addressed.
- In `clone_and_build_llama_cpp`, `install_prefix` is no longer defined after removing the local assignment, but it's still used in the final `install` command, which will cause the script to fail or use an unintended value. Either reintroduce a local assignment from `set_install_prefix` (a sketch appears under Comment 1 below) or adjust the install path accordingly.
- The HIP build flags changed from `-DAMDGPU_TARGETS=...` to `-DGPU_TARGETS=...`; please confirm this matches the current llama.cpp/whisper.cpp CMake option names, as this rename may be ignored or break target selection if the upstream variable is still `AMDGPU_TARGETS`.
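If the option name is in doubt, one defensive pattern (an assumption for illustration, not what this PR does) is to pass both spellings; CMake only warns about an unused cache definition, so whichever name the pinned llama.cpp commit actually reads takes effect:

```bash
# amdgpu_targets is a hypothetical variable holding e.g. "gfx1100;gfx1030".
common_flags+=("-DGPU_TARGETS=${amdgpu_targets}" "-DAMDGPU_TARGETS=${amdgpu_targets}")
```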
## Individual Comments
### Comment 1
<location> `container-images/scripts/build_llama_and_whisper.sh:246-245` </location>
<code_context>
clone_and_build_llama_cpp() {
local llama_cpp_commit="${LLAMA_CPP_PULL_REF:-$DEFAULT_LLAMA_CPP_COMMIT}"
- local install_prefix
- install_prefix=$(set_install_prefix)
git_clone_specific_commit "${LLAMA_CPP_REPO:-https://github.com/ggml-org/llama.cpp}" "$llama_cpp_commit"
cmake_steps "${common_flags[@]}"
install -m 755 build/bin/rpc-server "$install_prefix"/bin/rpc-server
</code_context>
<issue_to_address>
**issue (bug_risk):** Removing the local install_prefix assignment risks installing rpc-server into an unintended path.
This function still calls `install ... "$install_prefix"/bin/rpc-server`, but `install_prefix` is no longer initialized here. Unless it’s guaranteed to be set in the environment, this will likely resolve to `/bin/rpc-server` or fail. Either restore `local install_prefix=$(set_install_prefix)` or remove this manual `install` in favor of `cmake --install` with the prefix configured in `cmake_steps`.
</issue_to_address>
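For concreteness, a sketch of the first option (restoring the local assignment). Everything except the two restored lines is copied from the excerpt above; `set_install_prefix` is assumed to echo the prefix, as the surrounding script suggests.

```bash
clone_and_build_llama_cpp() {
  local llama_cpp_commit="${LLAMA_CPP_PULL_REF:-$DEFAULT_LLAMA_CPP_COMMIT}"
  # Reintroduce the local assignment so the install destination is explicit
  # rather than inherited (or empty) from the environment.
  local install_prefix
  install_prefix=$(set_install_prefix)
  git_clone_specific_commit "${LLAMA_CPP_REPO:-https://github.com/ggml-org/llama.cpp}" "$llama_cpp_commit"
  cmake_steps "${common_flags[@]}"
  install -m 755 build/bin/rpc-server "$install_prefix"/bin/rpc-server
}
```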
### Comment 2
<location> `container-images/rocm/Containerfile:18` </location>
<code_context>
+ /tmp/install/bin/llama-quantize \
+ /tmp/install/bin/whisper-server \
+ /usr/bin/
+COPY --from=builder /tmp/install/lib64/*.so /usr/lib64/
+
+RUN dnf -y --setopt=install_weak_deps=false install hipblas rocblas rocm-hip rocm-runtime rocsolver && \
</code_context>
<issue_to_address>
**suggestion:** The COPY glob for shared libraries can fail if no .so files exist or miss versioned symlinks.
To avoid this fragility, either copy the entire directory (e.g., `COPY --from=builder /tmp/install/lib64/ /usr/lib64/`) or use a broader pattern such as `*.so*` so that all ROCm shared libraries and their symlinks are reliably included.
```suggestion
COPY --from=builder /tmp/install/lib64/ /usr/lib64/
```
</issue_to_address>
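The broader glob mentioned as the second option would look like the sketch below; whether versioned library names actually exist under `/tmp/install/lib64` is an assumption.

```dockerfile
# *.so* matches plain .so files as well as versioned names and symlinks
# such as libfoo.so.1, so the runtime linker chain survives the copy.
COPY --from=builder /tmp/install/lib64/*.so* /usr/lib64/
```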
Code Review
The code changes refactor the ROCm container build to a multi-stage Dockerfile, separating the llama.cpp and whisper.cpp compilation into a builder stage and then copying specific binaries and shared libraries, along with installing ROCm runtime dependencies, into the final image. The build script was updated to correctly configure the install prefix and GPU targets for ROCm. A review comment points out that the rpc-server binary is not being copied to the final image and suggests using a wildcard in the COPY command for binaries to ensure all executables are included robustly.
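A sketch of that last suggestion, assuming everything staged under `bin/` is meant to ship in the final image:

```dockerfile
# Copying the whole staged directory picks up rpc-server (and any future
# binaries) without enumerating each file by name.
COPY --from=builder /tmp/install/bin/ /usr/bin/
```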
Only copy required binaries and libraries from the installation directory into the final image, and install only necessary runtime dependencies. The final image size is reduced by over 2 GB. Signed-off-by: Mike Bonnet <[email protected]>

Force-pushed 531b152 to 3fbfc50.
Override the use of "uv", so Python dependencies get installed into system directories. Signed-off-by: Mike Bonnet <[email protected]>

Force-pushed 3fbfc50 to 532e5d0.
Signed-off-by: Mike Bonnet <[email protected]>
/retest ramalama-on-pull-request
LGTM, assuming you tested this on an AMD system?
Unfortunately I don't have an AMD system to test with. If someone could test…
Well, I have an AMD laptop, and it fails the same way with your image as with the quay.io/ramalama/ramalama image. Luckily for me, the laptop works with the Vulkan driver.
Memory critical error by agent node-0 (Agent handle: 0x3390320) on address 0x7f599f65e000. Reason: Memory in use.
Does the quay.io/ramalama/rocm:0.15 image behave the same way?
Yes
There were reports that it was a firmware problem. Have you tried with different/older firmware?

I'm almost done setting up an AMD system, so I can test this soon, hopefully tomorrow.
Summary by Sourcery
Use a multi-stage build for the ROCm container image and adjust build scripts to support installation into a temporary prefix for ROCm builds.