Commits
43 commits
992adfb
fix: add better port logic (#2175) (#2192)
alec-flowers Jul 30, 2025
9a93f11
chore: fix install (#2191)
ishandhanani Jul 30, 2025
2a616da
chore: fix QA bugs in documentation/readmes (#2199)
athreesh Jul 30, 2025
d0de1a0
feat: Add trtllm deploy examples for k8s #2133 (#2207)
biswapanda Jul 31, 2025
edccbd5
fix(sglang): disagg yaml worker change and agg kv router fix (#2205)
ishandhanani Jul 31, 2025
54fbff3
fix: add curl and jq for health checks #2203 (#2209)
biswapanda Jul 31, 2025
a9b6b28
fix: Kprashanth/trtllm rc4 cherry pick (#2218)
KrishnanPrash Jul 31, 2025
65e89b3
chore: cleanup dead links (#2208)
nealvaidya Jul 31, 2025
c92dc98
chore: update nixl version to 0.4.1 (#2221) (#2228)
nv-anants Jul 31, 2025
eb58916
chore: Remove multimodal readme. (#2212) (#2234)
krishung5 Jul 31, 2025
e848cf5
fix: Cherry pick pr 2186 release 0.4.0 to fix docs/runtime/README.md …
keivenchang Aug 1, 2025
5e3586d
fix: drop cuda graph bs (batch size) on dsr1 h100 sgl (#2235)
ishandhanani Aug 1, 2025
4fbb4e5
fix: handle groveTerminationDelay and auto-detect grove installation …
julienmancuso Aug 1, 2025
dc13774
fix: Locked triton==3.3.1 since triton 3.4.0 breaks tensorrt-llm 1.0.…
dmitry-tokarev-nv Aug 1, 2025
e5e94ad
fix: sgl instructions point to new frontend (#2245)
ishandhanani Aug 1, 2025
92781d3
fix: Update disagg configs for trtllm 1.0.0rc4 changes (release/0.4.0…
rmccorm4 Aug 4, 2025
58ad4a2
fix: readme instruction (#2265)
ishandhanani Aug 4, 2025
039c061
fix: Update eagle_one configs with speculative_model_dir field (#2283)
rmccorm4 Aug 4, 2025
2a8e251
docs: Backport: Dyn 591 (#2247) to 0.4.0 (#2251)
atchernych Aug 4, 2025
2dc4a4b
fix: trtllm container - ENV var used before declaration (#2277)
dmitry-tokarev-nv Aug 5, 2025
85737ba
fix: Update the NIXL TRTLLM commit version to rc4 (#2285)
tanmayv25 Aug 5, 2025
27c8a97
docs: add instruction to deploy model with inference gateway #2257 (#…
biswapanda Aug 5, 2025
641e49d
fix: fix nil pointer deref in dynamo controller (#2293) (#2299)
mohammedabdulwahhab Aug 5, 2025
1b145bb
fix: fix broken doc links (#2308)
biswapanda Aug 5, 2025
4e4818f
fix: Copy cuda libraries from devel to runtime stage (#2298)
nv-tusharma Aug 5, 2025
c92c1f4
docs: update deploy readme (#2306)
atchernych Aug 5, 2025
6fce98a
fix: Add common and test dependencies to sglang runtime build (#2279)…
nv-tusharma Aug 5, 2025
035d6d8
fix: Revert the commit for DeepGEMM to fix vLLM WideEP (#2302) (#2325)
krishung5 Aug 6, 2025
167c793
fix: Backport/anish index rst into 0.4.0 - fix links in docs and more…
athreesh Aug 6, 2025
409aa9e
docs: Final fixes to links reported by QA (#2334)
athreesh Aug 6, 2025
71126c7
fix: nil pointer deref in dynamo controller (#2335)
mohammedabdulwahhab Aug 6, 2025
f342c30
docs: address sphinx build errors for docs.nvidia.com (#2346)
athreesh Aug 7, 2025
96d1f15
docs: Address vincent issue with trtllm symlink (#2351)
athreesh Aug 7, 2025
e8b37a6
fix: ARM Flashinfer Versioning for 0.4.0 Release (#2363)
zaristei Aug 8, 2025
b5c9278
fix: Pinned PyTorch version for vLLM container (#2356)
krishung5 Aug 8, 2025
b0c1a24
chore: ATTRIBUTIONS-Go.md (#2355)
dmitry-tokarev-nv Aug 8, 2025
0cf8041
Revert "adjust tag to accomodate flashinfer versioning typo" (#2364)
zaristei Aug 8, 2025
bd8e368
fix: use wheel files for installation in trtllm build (#2372) (#2375)
nv-anants Aug 8, 2025
73bcc3b
fix(build): Pin cuda-python>=12,<13 to avoid trtllm breakage (#2379)
rmccorm4 Aug 8, 2025
aa57c6b
fix: turn off kvbm for al2023 support (#2533)
saturley-hall Aug 21, 2025
3f0a725
docs: add trtllm known issue for al2023 (#2604) (#2612)
nv-anants Aug 21, 2025
d98a791
docs: update trtllm know issue message (#2639) (#2643)
nv-anants Aug 22, 2025
37fca1c
fix: prevent crash looping hello world (#2625)
biswapanda Aug 22, 2025
fix: turn off kvbm for al2023 support (#2533)
saturley-hall authored Aug 21, 2025
commit aa57c6b4b2e8a2ce049ffb1d1cae31185f1b13ad
18 changes: 9 additions & 9 deletions Cargo.lock

(Generated file; diff not rendered.)

2 changes: 1 addition & 1 deletion Cargo.toml
@@ -15,7 +15,7 @@ members = [
resolver = "3"

[workspace.package]
-version = "0.4.0"
+version = "0.4.0+post0"

💡 Verification agent

🧩 Analysis chain

Workspace version bump may desync local crate versions

Bumping [workspace.package].version to 0.4.0+post0 while workspace.dependencies pin local crates at 0.4.0 can cause version mismatches if member crates inherit the workspace version or if any member’s package.version is updated without updating these dependency constraints.

Choose one consistent approach:

  • Option A (quick): keep workspace at 0.4.0 for now.
  • Option B (preferred): bump local crate versions and dependency pins to 0.4.0+post0.

If going with Option B, update the local workspace dependency versions:

 [workspace.dependencies]
 # Local crates
-dynamo-runtime = { path = "lib/runtime", version = "0.4.0" }
-dynamo-llm = { path = "lib/llm", version = "0.4.0" }
-dynamo-tokens = { path = "lib/tokens", version = "0.4.0" }
+dynamo-runtime = { path = "lib/runtime", version = "0.4.0+post0" }
+dynamo-llm = { path = "lib/llm", version = "0.4.0+post0" }
+dynamo-tokens = { path = "lib/tokens", version = "0.4.0+post0" }

I can also generate a script to scan member crates for version.workspace = true and align everything if helpful.
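
A minimal sketch of such a scan (assuming GNU grep, and assuming member crates live under lib/ and components/ as copied in the Dockerfile below; the exact layout is an assumption):

#!/bin/bash
# Find member manifests that inherit the workspace version and would
# therefore silently pick up the 0.4.0+post0 bump.
grep -rl --include=Cargo.toml 'version\.workspace *= *true' lib/ components/ \
  | while read -r manifest; do
      echo "${manifest}: inherits [workspace.package].version"
    done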


🏁 Script executed:

#!/bin/bash
set -eux

# Inspect root Cargo.toml for workspace.package version
echo "===== [workspace.package] block in root Cargo.toml ====="
sed -n -n '/^\[workspace.package\]/,/^\[/{p}' Cargo.toml || true

# Inspect root Cargo.toml for workspace.dependencies block
echo -e "\n===== [workspace.dependencies] block in root Cargo.toml ====="
sed -n -n '/^\[workspace.dependencies\]/,/^\[/{p}' Cargo.toml || true

# Inspect member crate Cargo.toml files
for crate in lib/runtime lib/llm lib/tokens; do
  echo -e "\n===== $crate/Cargo.toml ====="
  if [ -f "$crate/Cargo.toml" ]; then
    # Show [package] block
    echo "---- [package] block ----"
    sed -n -n '/^\[package\]/,/^\[/{p}' "$crate/Cargo.toml"
    # Show [dependencies] block
    echo "---- [dependencies] block ----"
    sed -n -n '/^\[dependencies\]/,/^\[/{p}' "$crate/Cargo.toml"
  else
    echo "$crate/Cargo.toml not found"
  fi
done

Length of output: 8765


Action Required: Align local crate dependency versions with the bumped workspace version

The root Cargo.toml now sets

[workspace.package]
version = "0.4.0+post0"

but under

[workspace.dependencies]

the local crates are still pinned at 0.4.0. Since each member crate uses version.workspace = true, its effective version becomes 0.4.0+post0, which conflicts with the 0.4.0 pins in [workspace.dependencies]. You have two choices:

  • Option A: Revert the workspace version bump back to "0.4.0".
  • Option B (preferred): Update the local crate entries in [workspace.dependencies] to match 0.4.0+post0.

If you proceed with Option B, apply this diff in root Cargo.toml:

 [workspace.dependencies]
 # Local crates
-dynamo-runtime = { path = "lib/runtime", version = "0.4.0" }
-dynamo-llm      = { path = "lib/llm",     version = "0.4.0" }
-dynamo-tokens   = { path = "lib/tokens",  version = "0.4.0" }
+dynamo-runtime = { path = "lib/runtime", version = "0.4.0+post0" }
+dynamo-llm      = { path = "lib/llm",     version = "0.4.0+post0" }
+dynamo-tokens   = { path = "lib/tokens",  version = "0.4.0+post0" }

• Root manifest: Cargo.toml
• Lines: update the three dynamo-* entries under [workspace.dependencies]
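
As a quick local sanity check, a sketch (assuming cargo and jq are available; not part of the suggested diff) that compares the effective member versions against the pins in the root manifest:

#!/bin/bash
# Resolved versions of the dynamo-* member crates as cargo sees them...
cargo metadata --no-deps --format-version 1 \
  | jq -r '.packages[] | select(.name | startswith("dynamo-")) | "\(.name) \(.version)"'
# ...and the pinned constraints under [workspace.dependencies] for comparison.
grep -n 'dynamo-.*version = "0\.4\.0' Cargo.toml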

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
version = "0.4.0+post0"
[workspace.dependencies]
# Local crates
dynamo-runtime = { path = "lib/runtime", version = "0.4.0+post0" }
dynamo-llm = { path = "lib/llm", version = "0.4.0+post0" }
dynamo-tokens = { path = "lib/tokens", version = "0.4.0+post0" }
🤖 Prompt for AI Agents
In Cargo.toml around line 18, the workspace package version was bumped to
"0.4.0+post0" but the local crate entries under [workspace.dependencies] remain
pinned to "0.4.0", causing a version mismatch; update the three dynamo-* entries
under [workspace.dependencies] to use version "0.4.0+post0" (or alternatively
revert the workspace.package.version to "0.4.0" if you prefer Option A).

edition = "2021"
description = "Dynamo Inference Framework"
authors = ["NVIDIA Inc. <[email protected]>"]
280 changes: 280 additions & 0 deletions container/Dockerfile
@@ -0,0 +1,280 @@
# SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

ARG BASE_IMAGE="nvcr.io/nvidia/cuda-dl-base"
# TODO OPS-612: NCCL will hang with 25.03, so use 25.01 for now
# Please check https://github.com/ai-dynamo/dynamo/pull/1065
# for details and reproducer to manually test if the image
# can be updated to later versions.
ARG BASE_IMAGE_TAG="25.01-cuda12.8-devel-ubuntu24.04"
ARG RELEASE_BUILD=false
ARG ENABLE_KVBM=false

# Define general architecture ARGs for supporting both x86 and aarch64 builds.
# ARCH: Used for package suffixes (e.g., amd64, arm64)
# ARCH_ALT: Used for Rust targets, manylinux suffix (e.g., x86_64, aarch64)
#
# Default values are for x86/amd64:
# --build-arg ARCH=amd64 --build-arg ARCH_ALT=x86_64
#
# For arm64/aarch64, build with:
# --build-arg ARCH=arm64 --build-arg ARCH_ALT=aarch64
#TODO OPS-592: Leverage uname -m to determine ARCH instead of passing it as an arg
ARG ARCH=amd64
ARG ARCH_ALT=x86_64


##################################
########## Base Image ############
##################################

FROM ${BASE_IMAGE}:${BASE_IMAGE_TAG} AS base

# Redeclare ARCH and ARCH_ALT so they're available in this stage
ARG ARCH
ARG ARCH_ALT
ARG CARGO_BUILD_JOBS

ARG NIXL_UCX_REF=v1.19.x
ARG NIXL_REF=0.4.1

# Environment variables for NIXL
ENV NIXL_SRC_DIR=/opt/nixl \
NIXL_PREFIX=/opt/nvidia/nvda_nixl \
NIXL_LIB_DIR=/opt/nvidia/nvda_nixl/lib/${ARCH_ALT}-linux-gnu \
NIXL_PLUGIN_DIR=/opt/nvidia/nvda_nixl/lib/${ARCH_ALT}-linux-gnu/plugins

USER root
ARG PYTHON_VERSION=3.12

COPY --from=ghcr.io/astral-sh/uv:latest /uv /uvx /bin/

# Rust environment setup
ENV RUSTUP_HOME=/usr/local/rustup \
CARGO_HOME=/usr/local/cargo \
PATH=/usr/local/cargo/bin:$PATH \
RUST_VERSION=1.87.0

WORKDIR /opt/dynamo

# Define Rust target based on ARCH_ALT ARG
ARG RUSTARCH=${ARCH_ALT}-unknown-linux-gnu

# Install Rust using RUSTARCH derived from ARCH_ALT
RUN wget --tries=3 --waitretry=5 "https://static.rust-lang.org/rustup/archive/1.28.1/${RUSTARCH}/rustup-init" && \
# TODO OPS-591: Add SHA check back based on RUSTARCH
chmod +x rustup-init && \
./rustup-init -y --no-modify-path --profile minimal --default-toolchain $RUST_VERSION --default-host ${RUSTARCH} && \
rm rustup-init && \
chmod -R a+w $RUSTUP_HOME $CARGO_HOME

RUN apt-get update -y \
&& DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
# NIXL build dependencies
autoconf \
automake \
cmake \
git \
libtool \
meson \
net-tools \
ninja-build \
pybind11-dev \
# These headers are missing with the hpcx installer, required
# by UCX to find RDMA devices
ibverbs-providers \
ibverbs-utils \
libibumad-dev \
libibverbs-dev \
librdmacm-dev \
libnuma-dev \
rdma-core \
# Rust build dependencies
clang \
libclang-dev \
protobuf-compiler \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*

# Download external dependencies in parallel for better performance
ENV NATS_VERSION="v2.10.28"
RUN --mount=type=cache,target=/var/cache/apt \
wget --tries=3 --waitretry=5 https://github.com/nats-io/nats-server/releases/download/${NATS_VERSION}/nats-server-${NATS_VERSION}-${ARCH}.deb && \
dpkg -i nats-server-${NATS_VERSION}-${ARCH}.deb && rm nats-server-${NATS_VERSION}-${ARCH}.deb

ENV ETCD_VERSION="v3.5.21"
RUN wget --tries=3 --waitretry=5 https://github.com/etcd-io/etcd/releases/download/$ETCD_VERSION/etcd-$ETCD_VERSION-linux-${ARCH}.tar.gz -O /tmp/etcd.tar.gz && \
mkdir -p /usr/local/bin/etcd && \
tar -xvf /tmp/etcd.tar.gz -C /usr/local/bin/etcd --strip-components=1 && \
rm /tmp/etcd.tar.gz
ENV PATH=/usr/local/bin/etcd/:$PATH

### UCX EFA Setup ###
RUN rm -rf /opt/hpcx/ucx && \
rm -rf /usr/local/ucx && \
echo "Building UCX with reference $NIXL_UCX_REF" && \
cd /usr/local/src && \
git clone --depth 1 --branch $NIXL_UCX_REF https://github.com/openucx/ucx.git && \
cd ucx && \
./autogen.sh && \
./configure \
--prefix=/usr/local/ucx \
--enable-shared \
--disable-static \
--disable-doxygen-doc \
--enable-optimizations \
--enable-cma \
--enable-devel-headers \
--with-cuda=/usr/local/cuda \
--with-verbs \
--with-efa \
--with-dm \
--with-gdrcopy=/usr/local \
--enable-mt && \
make -j$(nproc) && \
make -j$(nproc) install-strip && \
echo "/usr/local/ucx/lib" > /etc/ld.so.conf.d/ucx.conf && \
echo "/usr/local/ucx/lib/ucx" >> /etc/ld.so.conf.d/ucx.conf && \
ldconfig && \
cd /usr/local/src && \
rm -rf ucx

# UCX environment variables
ENV CPATH=/usr/include:$CPATH \
PATH=/usr/bin:$PATH \
PKG_CONFIG_PATH=/usr/lib/pkgconfig:$PKG_CONFIG_PATH

### NIXL SETUP ###
# Clone nixl source with shallow clone for faster download
RUN git clone --depth 1 --branch ${NIXL_REF} "https://github.com/ai-dynamo/nixl.git" ${NIXL_SRC_DIR} && \
cd ${NIXL_SRC_DIR} && \
if [ "$ARCH" = "arm64" ]; then \
nixl_build_args="-Ddisable_gds_backend=true"; \
else \
nixl_build_args=""; \
fi && \
meson setup build/ --buildtype=release --prefix=$NIXL_PREFIX $nixl_build_args && \
ninja -C build/ -j$(nproc) && \
ninja -C build/ install && \
echo "$NIXL_LIB_DIR" > /etc/ld.so.conf.d/nixl.conf && \
echo "$NIXL_PLUGIN_DIR" >> /etc/ld.so.conf.d/nixl.conf && \
ldconfig

# Install NIXL Python module
# TODO OPS-590: Move gds_path selection based on arch into NIXL build and re-enable gds backend for arm64
RUN if [ "$ARCH" = "arm64" ]; then \
cd ${NIXL_SRC_DIR} && uv build . --out-dir /opt/dynamo/wheelhouse/nixl \
--config-settings=setup-args="-Ddisable_gds_backend=true"; \
else \
cd ${NIXL_SRC_DIR} && uv build . --out-dir /opt/dynamo/wheelhouse/nixl; \
fi

# Create virtual environment
RUN mkdir -p /opt/dynamo/venv && \
uv venv /opt/dynamo/venv --python 3.12

# Activate virtual environment
ENV VIRTUAL_ENV=/opt/dynamo/venv \
PATH="/opt/dynamo/venv/bin:${PATH}"

# Install common and test dependencies
RUN --mount=type=bind,source=./container/deps/requirements.txt,target=/tmp/requirements.txt \
--mount=type=bind,source=./container/deps/requirements.test.txt,target=/tmp/requirements.test.txt \
uv pip install --requirement /tmp/requirements.txt --requirement /tmp/requirements.test.txt

##################################
##### Wheel Build Image ##########
##################################

# Redeclare ARCH_ALT ARG so it's available for interpolation in the FROM instruction
ARG ARCH_ALT

FROM quay.io/pypa/manylinux_2_28_${ARCH_ALT} AS wheel_builder

ARG CARGO_BUILD_JOBS
# Set CARGO_BUILD_JOBS to 16 if not provided
# This is to prevent cargo from building $(nproc) jobs in parallel,
# which might exceed the number of opened files limit.
ENV CARGO_BUILD_JOBS=${CARGO_BUILD_JOBS:-16}
# Use build arg RELEASE_BUILD = true to generate wheels for Python 3.10, 3.11 and 3.12.
ARG RELEASE_BUILD
# Use arg ENABLE_KVBM = true to turn on the block-manager feature
ARG ENABLE_KVBM

WORKDIR /opt/dynamo

RUN dnf update -y \
&& dnf install -y llvm-toolset protobuf-compiler python3.12-devel \
&& dnf clean all \
&& rm -rf /var/cache/dnf

ENV RUSTUP_HOME=/usr/local/rustup \
CARGO_HOME=/usr/local/cargo \
CARGO_TARGET_DIR=/opt/dynamo/target \
VIRTUAL_ENV=/opt/dynamo/venv \
NIXL_PREFIX=/opt/nvidia/nvda_nixl

COPY --from=base $RUSTUP_HOME $RUSTUP_HOME
COPY --from=base $CARGO_HOME $CARGO_HOME
COPY --from=base $NIXL_PREFIX $NIXL_PREFIX
COPY --from=base $VIRTUAL_ENV $VIRTUAL_ENV
ENV PATH=$CARGO_HOME/bin:$VIRTUAL_ENV/bin:$PATH

# Copy configuration files first for better layer caching
COPY pyproject.toml README.md LICENSE Cargo.toml Cargo.lock rust-toolchain.toml /opt/dynamo/

# Copy source code
COPY lib/ /opt/dynamo/lib/
COPY components/ /opt/dynamo/components/

# Build dynamo wheel
RUN uv build --wheel --out-dir /opt/dynamo/dist && \
cd /opt/dynamo/lib/bindings/python && \
uv pip install maturin[patchelf] && \
if [ "$ENABLE_KVBM" = "true" ]; then \
maturin build --release --features block-manager --out /opt/dynamo/dist; \
else \
maturin build --release --out /opt/dynamo/dist; \
fi && \
if [ "$RELEASE_BUILD" = "true" ]; then \
# do not enable KVBM feature, ensure compatibility with lower glibc
uv run --python 3.11 maturin build --release --out /opt/dynamo/dist && \
uv run --python 3.10 maturin build --release --out /opt/dynamo/dist; \
fi

Comment on lines +192 to +244

💡 Verification agent

🧩 Analysis chain

Wheel build matrix relies on uv run --python 3.11/3.10; ensure interpreters are available.

Manylinux images don’t ship multiple CPythons by default. uv can provision, but this path should be validated to avoid build-time surprises.


🏁 Script executed:

#!/bin/bash
# Validate uv can run with specified versions in the manylinux image
echo "Python versions visible:" && command -v python3.12 || true
uv --version
uv run --python 3.11 -c 'import sys; print(sys.version)'
uv run --python 3.10 -c 'import sys; print(sys.version)'

Length of output: 501


🏁 Script executed:

#!/bin/bash
set -eo pipefail
# Display the initial portion of the Dockerfile to inspect the 'base' stage setup
sed -n '1,200p' container/Dockerfile

# Search for any installation or invocation of 'uv' in the Dockerfile
grep -nE "pip install.*uv|uv" container/Dockerfile || true

Length of output: 7995


Wheel build will fail: missing uv CLI in wheel_builder stage

The wheel_builder stage copies Rust, NIXL, and the Python venv from the base stage but never brings in the uv (and uvx) binaries installed there. As a result, all uv build, uv pip and uv run commands will error out with "command not found."

• In container/Dockerfile, immediately after:

FROM quay.io/pypa/manylinux_2_28_${ARCH_ALT} AS wheel_builder

add:

+ COPY --from=base /uv /uvx /bin/

• Verify that uv can provision Python 3.10 and 3.11 in an environment without a system interpreter (e.g. offline); if not, either pre-install those interpreters into your virtualenv or adjust the build to use only the CPython versions already present.
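
A sketch of that validation (illustrative only; it assumes network access during the build so uv can fetch standalone CPython builds, and assumes the uv binary has been copied in as suggested above):

#!/bin/bash
# Pre-fetch the interpreters uv would otherwise download lazily mid-build...
uv python install 3.10 3.11
# ...then confirm each one resolves before maturin is invoked.
uv run --python 3.11 -c 'import sys; print(sys.version)'
uv run --python 3.10 -c 'import sys; print(sys.version)'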

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
FROM quay.io/pypa/manylinux_2_28_${ARCH_ALT} AS wheel_builder
ARG CARGO_BUILD_JOBS
# Set CARGO_BUILD_JOBS to 16 if not provided
# This is to prevent cargo from building $(nproc) jobs in parallel,
# which might exceed the number of opened files limit.
ENV CARGO_BUILD_JOBS=${CARGO_BUILD_JOBS:-16}
# Use build arg RELEASE_BUILD = true to generate wheels for Python 3.10, 3.11 and 3.12.
ARG RELEASE_BUILD
# Use arg ENABLE_KVBM = true to turn on the block-manager feature
ARG ENABLE_KVBM
WORKDIR /opt/dynamo
RUN dnf update -y \
&& dnf install -y llvm-toolset protobuf-compiler python3.12-devel \
&& dnf clean all \
&& rm -rf /var/cache/dnf
ENV RUSTUP_HOME=/usr/local/rustup \
CARGO_HOME=/usr/local/cargo \
CARGO_TARGET_DIR=/opt/dynamo/target \
VIRTUAL_ENV=/opt/dynamo/venv \
NIXL_PREFIX=/opt/nvidia/nvda_nixl
COPY --from=base $RUSTUP_HOME $RUSTUP_HOME
COPY --from=base $CARGO_HOME $CARGO_HOME
COPY --from=base $NIXL_PREFIX $NIXL_PREFIX
COPY --from=base $VIRTUAL_ENV $VIRTUAL_ENV
ENV PATH=$CARGO_HOME/bin:$VIRTUAL_ENV/bin:$PATH
# Copy configuration files first for better layer caching
COPY pyproject.toml README.md LICENSE Cargo.toml Cargo.lock rust-toolchain.toml /opt/dynamo/
# Copy source code
COPY lib/ /opt/dynamo/lib/
COPY components/ /opt/dynamo/components/
# Build dynamo wheel
RUN uv build --wheel --out-dir /opt/dynamo/dist && \
cd /opt/dynamo/lib/bindings/python && \
uv pip install maturin[patchelf] && \
if [ "$ENABLE_KVBM" = "true" ]; then \
maturin build --release --features block-manager --out /opt/dynamo/dist; \
else \
maturin build --release --out /opt/dynamo/dist; \
fi && \
if [ "$RELEASE_BUILD" = "true" ]; then \
# do not enable KVBM feature, ensure compatibility with lower glibc
uv run --python 3.11 maturin build --release --out /opt/dynamo/dist && \
uv run --python 3.10 maturin build --release --out /opt/dynamo/dist; \
fi
FROM quay.io/pypa/manylinux_2_28_${ARCH_ALT} AS wheel_builder
COPY --from=base /uv /uvx /bin/
ARG CARGO_BUILD_JOBS
# Set CARGO_BUILD_JOBS to 16 if not provided
# This is to prevent cargo from building $(nproc) jobs in parallel,
# which might exceed the number of opened files limit.
ENV CARGO_BUILD_JOBS=${CARGO_BUILD_JOBS:-16}
# Use build arg RELEASE_BUILD = true to generate wheels for Python 3.10, 3.11 and 3.12.
ARG RELEASE_BUILD
# Use arg ENABLE_KVBM = true to turn on the block-manager feature
ARG ENABLE_KVBM
WORKDIR /opt/dynamo
RUN dnf update -y \
&& dnf install -y llvm-toolset protobuf-compiler python3.12-devel \
&& dnf clean all \
&& rm -rf /var/cache/dnf
ENV RUSTUP_HOME=/usr/local/rustup \
CARGO_HOME=/usr/local/cargo \
CARGO_TARGET_DIR=/opt/dynamo/target \
VIRTUAL_ENV=/opt/dynamo/venv \
NIXL_PREFIX=/opt/nvidia/nvda_nixl
COPY --from=base $RUSTUP_HOME $RUSTUP_HOME
COPY --from=base $CARGO_HOME $CARGO_HOME
COPY --from=base $NIXL_PREFIX $NIXL_PREFIX
COPY --from=base $VIRTUAL_ENV $VIRTUAL_ENV
ENV PATH=$CARGO_HOME/bin:$VIRTUAL_ENV/bin:$PATH
# Copy configuration files first for better layer caching
COPY pyproject.toml README.md LICENSE Cargo.toml Cargo.lock rust-toolchain.toml /opt/dynamo/
# Copy source code
COPY lib/ /opt/dynamo/lib/
COPY components/ /opt/dynamo/components/
# Build dynamo wheel
RUN uv build --wheel --out-dir /opt/dynamo/dist && \
cd /opt/dynamo/lib/bindings/python && \
uv pip install maturin[patchelf] && \
if [ "$ENABLE_KVBM" = "true" ]; then \
maturin build --release --features block-manager --out /opt/dynamo/dist; \
else \
maturin build --release --out /opt/dynamo/dist; \
fi && \
if [ "$RELEASE_BUILD" = "true" ]; then \
# do not enable KVBM feature, ensure compatibility with lower glibc
uv run --python 3.11 maturin build --release --out /opt/dynamo/dist && \
uv run --python 3.10 maturin build --release --out /opt/dynamo/dist; \
fi
🤖 Prompt for AI Agents
In container/Dockerfile around lines 192-244, the wheel_builder stage runs uv
commands but never brings the uv/uvx CLI binaries from the base stage, causing
"command not found" failures; fix by copying the uv/uvx executables (wherever
they live in the base image, e.g. the base virtualenv's bin or /usr/local/bin)
into the wheel_builder image and ensure that location is on PATH (or
alternatively install uv in this stage); also verify uv can provision Python
3.10/3.11 offline—if it cannot, either pre-install those CPython interpreters
into the venv copied from base or change the build to use only interpreters
present in the image.

##############################################
########## Dev entrypoint image ##############
##############################################
FROM base AS dev

# Application environment variables
ENV DYNAMO_HOME=/opt/dynamo \
CARGO_TARGET_DIR=/opt/dynamo/target \
PYTHONPATH=/opt/dynamo:$PYTHONPATH

WORKDIR /opt/dynamo

COPY --from=wheel_builder /opt/dynamo/dist/*.whl /opt/dynamo/wheelhouse/
COPY --from=wheel_builder $CARGO_TARGET_DIR $CARGO_TARGET_DIR

# Copy Cargo cache to avoid re-downloading dependencies
COPY --from=wheel_builder $CARGO_HOME $CARGO_HOME

# Temporarily copy benchmarks folder for installation
COPY benchmarks/ /opt/dynamo/benchmarks/

# Install all python packages
RUN uv pip install \
/opt/dynamo/wheelhouse/ai_dynamo_runtime*cp312*.whl \
/opt/dynamo/wheelhouse/ai_dynamo*any.whl \
/opt/dynamo/wheelhouse/nixl/nixl*.whl \
/opt/dynamo/benchmarks && \
rm -rf /opt/dynamo/benchmarks

# Copy launch banner
RUN --mount=type=bind,source=./container/launch_message.txt,target=/opt/dynamo/launch_message.txt \
sed '/^#\s/d' /opt/dynamo/launch_message.txt > ~/.launch_screen && \
echo "cat ~/.launch_screen" >> ~/.bashrc

ENTRYPOINT ["/opt/nvidia/nvidia_entrypoint.sh"]
CMD []