
Conversation

@mikebonnet (Collaborator) commented Dec 15, 2025

Only copy the required binaries and libraries from the installation directory into the final image, and install only the necessary runtime dependencies. The final image size is reduced by over 2 GB.

Summary by Sourcery

Use a multi-stage build for the ROCm container image and adjust build scripts to support installation into a temporary prefix for ROCm builds.

Enhancements:

  • Refactor the ROCm Containerfile to use a builder stage and produce a slimmer runtime image by copying only required binaries and shared libraries.
  • Update the build_llama_and_whisper.sh script to treat ROCm like other GPU-specific images for install prefix handling and to use a unified GPU targets flag instead of AMDGPU-specific targets.

@sourcery-ai bot (Contributor) commented Dec 15, 2025


Reviewer's Guide

Converts the ROCm container image to a multi-stage build in which the llama/whisper binaries are installed under /tmp/install in a builder stage and copied into a slim runtime image, adjusts the build script to use a temporary install prefix for ROCm and GPU_TARGETS instead of AMDGPU_TARGETS, and removes some redundant install logic for the whisper and llama binaries.

File-Level Changes

container-images/rocm/Containerfile: convert the ROCm Containerfile to a multi-stage build with a slim runtime image (a hedged sketch of the resulting layout follows this list).
  • Change the base stage to a named builder stage using Fedora 43 and run the existing build_llama_and_whisper.sh rocm script there.
  • Introduce a second Fedora 43 runtime stage that copies only selected llama and whisper binaries from /tmp/install/bin into /usr/bin.
  • Copy only shared libraries from /tmp/install/lib64 into /usr/lib64 in the final image.
  • Install only the required ROCm runtime packages (hipblas, rocblas, rocm-hip, rocm-runtime, rocsolver) with weak dependencies disabled, then clean the dnf caches.

container-images/scripts/build_llama_and_whisper.sh: align the build script's install prefix and flags with the ROCm multi-stage layout and simplify the install steps.
  • Extend set_install_prefix to use /tmp/install for the rocm containerfile, matching the other GPU-specific images.
  • Change the ROCm CMake flag from AMDGPU_TARGETS to GPU_TARGETS, preserving the default list of architectures.
  • Remove the explicit mkdir of the install_prefix/bin directory from the whisper.cpp build function (now handled by cmake install or prior steps).
  • Remove the redundant local install_prefix variable and its use in clone_and_build_llama_cpp, relying instead on cmake install behavior or centralized prefix logic.
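
For orientation, here is a minimal sketch of the layout those bullets describe. This is not the Containerfile as merged: the base image reference, the build-script invocation, and the full list of copied binaries are assumptions; only the binaries visible in the review's diff context below are listed.

```Dockerfile
# Builder stage: compile llama.cpp and whisper.cpp for ROCm, installing into a
# temporary prefix that only this stage carries.
# (Base image reference and script location are assumptions.)
FROM quay.io/fedora/fedora:43 AS builder
COPY container-images/scripts /scripts
RUN /scripts/build_llama_and_whisper.sh rocm   # installs under /tmp/install

# Runtime stage: a clean base plus only the needed binaries, shared libraries,
# and ROCm runtime packages.
FROM quay.io/fedora/fedora:43
# Only the binaries visible in the diff context are shown; the merged
# Containerfile copies more.
COPY --from=builder /tmp/install/bin/llama-quantize \
                    /tmp/install/bin/whisper-server \
                    /usr/bin/
COPY --from=builder /tmp/install/lib64/*.so /usr/lib64/
RUN dnf -y --setopt=install_weak_deps=false install \
        hipblas rocblas rocm-hip rocm-runtime rocsolver && \
    dnf clean all
```

The dnf package list and the /tmp/install paths come straight from the bullets above; everything else is illustrative.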

Possibly linked issues

  • #rocm-ubi container build fails around dnf remove gcc-c++: PR’s multi-stage rocm build avoids the problematic dnf remove step that caused the original build failure.
  • #ramalama new rocm image is broken due to missing libhipblas.so.2: both address the broken ROCm image; this PR adds the hipblas/ROCm runtimes via the multi-stage build, fixing the missing libhipblas.so.2 (a spot-check sketch follows below).
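
A hedged way to spot-check that second issue against the rebuilt image, assuming ldd is available inside it; the tag is the on-PR image mentioned later in this thread, and --entrypoint guards against any entrypoint the image defines.

```bash
# Report any unresolved shared libraries for whisper-server (the earlier
# breakage was a missing libhipblas.so.2).
podman run --rm --entrypoint /usr/bin/ldd \
  quay.io/redhat-user-workloads/ramalama-tenant/rocm:on-pr-34482d11f78e7c93649c4cddb070c285e355ac39 \
  /usr/bin/whisper-server | grep "not found" || echo "all shared libraries resolved"
```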


@gemini-code-assist bot (Contributor) commented

Summary of Changes

Hello @mikebonnet, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a multi-stage build strategy for the ROCm container image. The core purpose is to dramatically decrease the final image's footprint by isolating the build artifacts and only transferring the essential components to the production-ready image. This optimization ensures that the container remains lean while still providing full functionality for ROCm-accelerated applications.

Highlights

  • Multi-stage build implementation: The ROCm Containerfile has been refactored to utilize a multi-stage build process, separating the build environment from the final runtime image.
  • Significant image size reduction: By selectively copying only the necessary binaries and libraries from the build stage and installing minimal runtime dependencies, the final image size is reduced by over 2GB.
  • Optimized dependency management: The final image now installs only essential ROCm runtime dependencies using dnf with install_weak_deps=false.
  • Build script adjustments: The build_llama_and_whisper.sh script was updated to set the installation prefix to /tmp/install for ROCm builds, aligning with the multi-stage approach, and to pass GPU_TARGETS instead of AMDGPU_TARGETS to CMake (sketched below).
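
A minimal sketch of those two script-side changes, with hypothetical case labels and variable names for anything the summaries do not spell out (the real default architecture list is elided):

```bash
# Sketch only: prefix selection and the CMake flag rename.
set_install_prefix() {
  case "$containerfile" in
    rocm|cuda|vulkan)          # hypothetical list; rocm now joins the GPU-specific images
      echo "/tmp/install" ;;   # temporary prefix the runtime stage copies from
    *)
      echo "/usr" ;;
  esac
}

# The HIP build flags now pass GPU_TARGETS rather than AMDGPU_TARGETS; the
# default architecture list itself is unchanged ($default_gpu_targets stands in for it).
common_flags+=("-DGPU_TARGETS=$default_gpu_targets")
```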

@sourcery-ai bot (Contributor) left a comment


Hey there - I've reviewed your changes and found some issues that need to be addressed.

  • In clone_and_build_llama_cpp, install_prefix is no longer defined after removing the local assignment, but it’s still used in the final install command, which will cause the script to fail or use an unintended value—either reintroduce a local assignment from set_install_prefix or adjust the install path accordingly.
  • The HIP build flags changed from -DAMDGPU_TARGETS=... to -DGPU_TARGETS=...; please confirm this matches the current llama.cpp/whisper.cpp CMake option names, as this rename may be ignored or break target selection if the upstream variable is still AMDGPU_TARGETS.
## Individual Comments

### Comment 1
<location> `container-images/scripts/build_llama_and_whisper.sh:246-245` </location>
<code_context>

 clone_and_build_llama_cpp() {
   local llama_cpp_commit="${LLAMA_CPP_PULL_REF:-$DEFAULT_LLAMA_CPP_COMMIT}"
-  local install_prefix
-  install_prefix=$(set_install_prefix)
   git_clone_specific_commit "${LLAMA_CPP_REPO:-https://github.com/ggml-org/llama.cpp}" "$llama_cpp_commit"
   cmake_steps "${common_flags[@]}"
   install -m 755 build/bin/rpc-server "$install_prefix"/bin/rpc-server
</code_context>

<issue_to_address>
**issue (bug_risk):** Removing the local install_prefix assignment risks installing rpc-server into an unintended path.

This function still calls `install ... "$install_prefix"/bin/rpc-server`, but `install_prefix` is no longer initialized here. Unless it’s guaranteed to be set in the environment, this will likely resolve to `/bin/rpc-server` or fail. Either restore `local install_prefix=$(set_install_prefix)` or remove this manual `install` in favor of `cmake --install` with the prefix configured in `cmake_steps`.
</issue_to_address>
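
For illustration only, the first option described above: restore the local assignment in the function from the diff context (a sketch, not the patch as merged).

```bash
clone_and_build_llama_cpp() {
  local llama_cpp_commit="${LLAMA_CPP_PULL_REF:-$DEFAULT_LLAMA_CPP_COMMIT}"
  local install_prefix
  install_prefix=$(set_install_prefix)   # restored so the manual install below has a defined target
  git_clone_specific_commit "${LLAMA_CPP_REPO:-https://github.com/ggml-org/llama.cpp}" "$llama_cpp_commit"
  cmake_steps "${common_flags[@]}"
  install -m 755 build/bin/rpc-server "$install_prefix/bin/rpc-server"
}
```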

### Comment 2
<location> `container-images/rocm/Containerfile:18` </location>
<code_context>
+  /tmp/install/bin/llama-quantize \
+  /tmp/install/bin/whisper-server \
+  /usr/bin/
+COPY --from=builder /tmp/install/lib64/*.so /usr/lib64/
+
+RUN dnf -y --setopt=install_weak_deps=false install hipblas rocblas rocm-hip rocm-runtime rocsolver && \
</code_context>

<issue_to_address>
**suggestion:** The COPY glob for shared libraries can fail if no .so files exist or miss versioned symlinks.

To avoid this fragility, either copy the entire directory (e.g., `COPY --from=builder /tmp/install/lib64/ /usr/lib64/`) or use a broader pattern such as `*.so*` so that all ROCm shared libraries and their symlinks are reliably included.

```suggestion
COPY --from=builder /tmp/install/lib64/ /usr/lib64/
```
</issue_to_address>


@gemini-code-assist bot (Contributor) left a comment


Code Review

The changes refactor the ROCm container build into a multi-stage Containerfile: llama.cpp and whisper.cpp are compiled in a builder stage, and only specific binaries and shared libraries, along with the ROCm runtime dependencies, are copied or installed into the final image. The build script was updated to configure the install prefix and GPU targets for ROCm accordingly. A review comment points out that the rpc-server binary is not copied to the final image and suggests using a wildcard in the COPY command for binaries so that all executables are reliably included.

Only copy required binaries and libraries from the installation directory into
the final image, and install only necessary runtime dependencies. The final image
size is reduced by over 2Gb.

Signed-off-by: Mike Bonnet <[email protected]>
Override the use of "uv", so python dependencies get installed into system directories.

Signed-off-by: Mike Bonnet <[email protected]>
@mikebonnet (Collaborator, Author) commented

/retest ramalama-on-pull-request

@rhatdan (Member) commented Dec 16, 2025

LGTM, assuming you tested this on an AMD system?

@mikebonnet (Collaborator, Author) commented

Unfortunately I don't have an AMD system to test with. If someone could test ramalama run or ramalama serve with --image quay.io/redhat-user-workloads/ramalama-tenant/rocm:on-pr-34482d11f78e7c93649c4cddb070c285e355ac39, that would be very helpful.

@rhatdan (Member) commented Dec 16, 2025

Well, I have an AMD laptop, and it fails the same way with your image as with the quay.io/ramalama/ramalama image. Luckily for me, the laptop works with the vulkan driver.

@rhatdan (Member) commented Dec 16, 2025

.Memory critical error by agent node-0 (Agent handle: 0x3390320) on address 0x7f599f65e000. Reason: Memory in use.

@mikebonnet (Collaborator, Author) commented

Does the quay.io/ramalama/rocm:0.15 image behave the same way?

@rhatdan (Member) commented Dec 16, 2025

Yes

@mikebonnet (Collaborator, Author) commented

There were reports that it was a firmware problem. Have you tried with different/older firmware?

@olliewalsh (Collaborator) commented

> Unfortunately I don't have an AMD system to test with. If someone could test ramalama run or ramalama serve with --image quay.io/redhat-user-workloads/ramalama-tenant/rocm:on-pr-34482d11f78e7c93649c4cddb070c285e355ac39, that would be very helpful.

I'm almost done setting up an AMD system, so I can test this soon, hopefully tomorrow.
