- `just` installed (`brew install just`)
- `cmake` installed (`brew install cmake`)
- `cargo` installed (packaged with Rust)
- `gh` CLI authenticated (`gh auth status`)
- llama.cpp fork cloned (`just build` does this automatically)

Then build:

    just build

On macOS, this clones or updates the llama.cpp fork if needed, builds it with `-DGGML_METAL=ON -DGGML_RPC=ON -DBUILD_SHARED_LIBS=OFF -DLLAMA_OPENSSL=OFF`, and builds the Rust `mesh-llm` binary. Linux release workflows build CPU, CUDA, ROCm, and Vulkan variants separately.
On Windows, use the release-specific recipes directly:

    just release-build-windows
    just release-build-cuda-windows
    just release-build-amd-windows
    just release-build-vulkan-windows

On macOS, check the dynamic library dependencies of the built binaries:

    otool -L llama.cpp/build/bin/llama-server | grep -v /System | grep -v /usr/lib
    otool -L llama.cpp/build/bin/rpc-server | grep -v /System | grep -v /usr/lib
    otool -L target/release/mesh-llm | grep -v /System | grep -v /usr/lib

Each should show only the binary name, with no `/opt/homebrew/` paths.
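The dependency check above can be automated rather than read by eye. The following is a minimal sketch, not part of the repository: `has_homebrew_deps` and `check_binary` are hypothetical helper names, and the only assumption about `otool -L` output is that linked library paths appear one per line.

```shell
# Hypothetical helper (not in the Justfile): reads `otool -L` output on
# stdin and succeeds when any line mentions a Homebrew path.
has_homebrew_deps() {
  grep -q '/opt/homebrew/'
}

# Hypothetical wrapper: checks a single binary and reports the result.
check_binary() {
  # Fail early if the binary was never built, so a missing file is not
  # mistaken for a clean one.
  if [ ! -f "$1" ]; then
    echo "FAIL: $1 not found" >&2
    return 1
  fi
  if otool -L "$1" | has_homebrew_deps; then
    echo "FAIL: $1 links against /opt/homebrew/" >&2
    return 1
  fi
  echo "ok: $1"
}
```

For example, `check_binary target/release/mesh-llm` would print an `ok:` line for a clean binary and return non-zero otherwise, which makes it easy to chain across all three binaries with `&&`.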

    just bundle

Creates `/tmp/mesh-bundle.tar.gz` containing `mesh-llm`, flavor-specific llama.cpp runtime binaries, and `llama-moe-split` for MoE shard generation.
Bundle naming now follows the same convention everywhere:
- macOS bundles package `rpc-server-metal` and `llama-server-metal`
- generic Linux bundles package `rpc-server-cpu` and `llama-server-cpu`
- CUDA Linux bundles package `rpc-server-cuda` and `llama-server-cuda`
- ROCm Linux bundles package `rpc-server-rocm` and `llama-server-rocm`
- Vulkan Linux bundles package `rpc-server-vulkan` and `llama-server-vulkan`
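The naming convention above is regular enough to express as a small lookup. This is an illustrative sketch only: `flavor_binaries` is a hypothetical helper, not something the repository provides, and the flavor list is taken directly from the bullets above.

```shell
# Hypothetical helper: prints the two runtime binary names that a bundle
# of the given flavor is expected to package.
flavor_binaries() {
  case "$1" in
    metal|cpu|cuda|rocm|vulkan)
      echo "rpc-server-$1 llama-server-$1"
      ;;
    *)
      echo "unknown flavor: $1" >&2
      return 1
      ;;
  esac
}
```

A script that inspects an unpacked bundle could loop over `$(flavor_binaries cuda)` and test that each file exists.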
On Windows, create release archives directly:

    just release-bundle-windows v0.X.0
    just release-bundle-cuda-windows v0.X.0
    just release-bundle-amd-windows v0.X.0
    just release-bundle-vulkan-windows v0.X.0

Those commands emit `.zip` assets in `dist/` with `mesh-llm.exe`, plus flavor-specific `rpc-server-<flavor>.exe` and `llama-server-<flavor>.exe`.
If optional Windows benchmark binaries such as `membench-fingerprint-cuda.exe` or `membench-fingerprint-hip.exe` are present in `mesh-llm/target/release/`, the PowerShell packager also includes them in the `.zip`.
Smoke-test the bundle created by `just bundle`:

    mkdir /tmp/test-bundle && tar xzf /tmp/mesh-bundle.tar.gz -C /tmp/test-bundle --strip-components=1
    /tmp/test-bundle/mesh-llm --model Qwen2.5-3B
    # Should download model, start solo, API on :9337, console on :3131
    # Hit http://localhost:9337/v1/chat/completions to verify inference works
    # Ctrl+C to stop
    rm -rf /tmp/test-bundle

To cut a release:

    just release v0.X.0

Run this from a clean local `main` branch. It bumps the version in the source and Cargo manifests, refreshes `Cargo.lock` without upgrading dependencies, commits as `v0.X.0: release`, pushes `main`, and then pushes only the new release tag.
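Since the release recipe expects a clean local `main` branch, a quick pre-flight check can catch mistakes before anything is pushed. The sketch below is an assumption about how one might do this, not a recipe in the Justfile: `is_release_ready` and `preflight` are hypothetical names, and the check covers only the branch name and uncommitted changes, not whether `main` is in sync with the remote.

```shell
# Hypothetical pre-flight predicate (not part of the Justfile).
# $1: current branch name
# $2: output of `git status --porcelain` (empty when the tree is clean)
is_release_ready() {
  [ "$1" = "main" ] && [ -z "$2" ]
}

# Hypothetical wrapper that gathers the inputs from the current repository.
preflight() {
  if is_release_ready "$(git rev-parse --abbrev-ref HEAD)" "$(git status --porcelain)"; then
    echo "ready to release"
  else
    echo "not on a clean main branch" >&2
    return 1
  fi
}
```

Running `preflight && just release v0.X.0` would then refuse to start a release from a feature branch or a dirty working tree.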
Pushing a `v*` tag triggers `.github/workflows/release.yml`, which:
- builds release bundles on macOS, Linux CPU, Linux CUDA, Linux ROCm, Linux Vulkan, and Windows CPU/CUDA/ROCm/Vulkan
- uses hosted `windows-2022` runners for Windows and installs the needed SDKs during the workflow
- uploads versioned assets such as `mesh-llm-v0.X.0-aarch64-apple-darwin.tar.gz`
- uploads stable `latest` assets such as `mesh-llm-x86_64-unknown-linux-gnu.tar.gz`
- uploads CUDA-specific Linux assets such as `mesh-llm-x86_64-unknown-linux-gnu-cuda.tar.gz`
- uploads ROCm-specific Linux assets such as `mesh-llm-x86_64-unknown-linux-gnu-rocm.tar.gz`
- uploads Vulkan-specific Linux assets such as `mesh-llm-x86_64-unknown-linux-gnu-vulkan.tar.gz`
- uploads Windows CPU assets such as `mesh-llm-x86_64-pc-windows-msvc.zip`
- uploads Windows CUDA assets such as `mesh-llm-x86_64-pc-windows-msvc-cuda.zip`
- uploads Windows ROCm assets such as `mesh-llm-x86_64-pc-windows-msvc-rocm.zip`
- uploads Windows Vulkan assets such as `mesh-llm-x86_64-pc-windows-msvc-vulkan.zip`
- keeps the legacy macOS `mesh-bundle.tar.gz` asset available for direct archive installs
- creates the GitHub release automatically with generated notes
After the workflow finishes, verify:
- `mesh-bundle.tar.gz` still exists for direct macOS archive installs
- `mesh-llm-aarch64-apple-darwin.tar.gz` exists
- `mesh-llm-x86_64-unknown-linux-gnu.tar.gz` exists
- `mesh-llm-x86_64-unknown-linux-gnu-cuda.tar.gz` exists
- `mesh-llm-x86_64-unknown-linux-gnu-rocm.tar.gz` exists
- `mesh-llm-x86_64-unknown-linux-gnu-vulkan.tar.gz` exists
- `mesh-llm-x86_64-pc-windows-msvc.zip` exists
- `mesh-llm-x86_64-pc-windows-msvc-cuda.zip` exists
- `mesh-llm-x86_64-pc-windows-msvc-rocm.zip` exists
- `mesh-llm-x86_64-pc-windows-msvc-vulkan.zip` exists
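The checklist above can also be verified mechanically. This is a minimal sketch under stated assumptions: `expected_assets` and `missing_assets` are hypothetical helpers, and the asset names are copied from the checklist rather than read from the workflow file.

```shell
# Asset names copied from the verification checklist above.
expected_assets() {
  cat <<'EOF'
mesh-bundle.tar.gz
mesh-llm-aarch64-apple-darwin.tar.gz
mesh-llm-x86_64-unknown-linux-gnu.tar.gz
mesh-llm-x86_64-unknown-linux-gnu-cuda.tar.gz
mesh-llm-x86_64-unknown-linux-gnu-rocm.tar.gz
mesh-llm-x86_64-unknown-linux-gnu-vulkan.tar.gz
mesh-llm-x86_64-pc-windows-msvc.zip
mesh-llm-x86_64-pc-windows-msvc-cuda.zip
mesh-llm-x86_64-pc-windows-msvc-rocm.zip
mesh-llm-x86_64-pc-windows-msvc-vulkan.zip
EOF
}

# Hypothetical helper: reads the actual asset names on stdin (one per
# line), prints every expected name that is absent, and returns non-zero
# if anything is missing.
missing_assets() {
  actual=$(cat)
  rc=0
  for name in $(expected_assets); do
    # -x matches the whole line, -F treats the name as a fixed string.
    if ! printf '%s\n' "$actual" | grep -qxF -- "$name"; then
      echo "$name"
      rc=1
    fi
  done
  return $rc
}
```

Assuming the `gh` CLI is authenticated, the asset names for a release can be piped in with `gh release view v0.X.0 --json assets --jq '.assets[].name' | missing_assets`. Empty output means every asset on the checklist was uploaded.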

Notes:

- The unversioned asset name `mesh-bundle.tar.gz` is still kept for compatibility with direct archive installs.
- The default Linux release bundle is a generic CPU build.
- Windows source builds exist, and tagged releases now publish Windows CPU/CUDA/ROCm/Vulkan `.zip` assets.
- Windows release artifacts can still be generated locally with the `*-windows` release recipes in `Justfile`.
- Release bundles use flavor-specific `rpc-server-<flavor>` and `llama-server-<flavor>` names so multiple flavors can coexist in one install directory. Use `mesh-llm --llama-flavor <flavor>` to force a specific pair.
- The CUDA Linux release bundle is built in CI with an explicit multi-arch `CMAKE_CUDA_ARCHITECTURES` list and is not runtime-tested during the workflow.
- The ROCm and Vulkan Linux release bundles are compile-tested in CI, but not runtime-tested against real GPUs during the workflow.
- The Windows release workflows are compile-and-package only. They do not run inference tests against real GPUs during the workflow.
- `codesign` and `xattr` may be needed on the receiving machine if macOS Gatekeeper blocks unsigned binaries:

      codesign -s - /usr/local/bin/mesh-llm /usr/local/bin/rpc-server /usr/local/bin/llama-server /usr/local/bin/llama-moe-split
      xattr -cr /usr/local/bin/mesh-llm /usr/local/bin/rpc-server /usr/local/bin/llama-server /usr/local/bin/llama-moe-split