4 changes: 3 additions & 1 deletion recipes/README.md
@@ -21,6 +21,8 @@ This repository contains production-ready recipes for deploying large language m
| deepseek-r1 | sglang | disagg (1 node, wide-ep) | 8x H200 | ✅ | 🚧 |🚧 |
| deepseek-r1 | sglang | disagg (multi-node, wide-ep) | 16x H200 | ✅ | 🚧 |🚧 |
| gpt-oss-120b | trtllm | agg | 4x GB200 | ✅ | ✅ |🚧 |
+| Qwen3-235B-A22B | trtllm | agg | 16x H200 | ✅ | ✅ |🚧 |
+| Qwen3-235B-A22B | trtllm | disagg | 16x H200 | ✅ | ✅ |🚧 |

**Legend:**
- ✅ Functional
@@ -294,4 +296,4 @@ kubectl wait --for=condition=Complete job/$PERF_JOB_NAME -n $NAMESPACE --timeout
```bash
# Check final benchmark results
kubectl logs job/$PERF_JOB_NAME -n $NAMESPACE | tail -50
-```
+```
6 changes: 2 additions & 4 deletions recipes/qwen3-235b-a22b-fp8/trtllm/disagg/deploy.yaml
@@ -128,8 +128,7 @@ spec:
--model-path "${MODEL_PATH}" \
--served-model-name "Qwen/Qwen3-235B-A22B-FP8" \
--extra-engine-args "${ENGINE_ARGS}" \
- --disaggregation-mode prefill \
- --disaggregation-strategy prefill_first
+ --disaggregation-mode prefill
volumeMounts:
- name: prefill-config
mountPath: /engine_configs
@@ -180,8 +179,7 @@ spec:
--model-path "${MODEL_PATH}" \
--served-model-name "Qwen/Qwen3-235B-A22B-FP8" \
--extra-engine-args "${ENGINE_ARGS}" \
- --disaggregation-mode decode \
- --disaggregation-strategy prefill_first
+ --disaggregation-mode decode
volumeMounts:
- name: decode-config
mountPath: /engine_configs