6 changes: 3 additions & 3 deletions components/backends/sglang/docs/dsr1-wideep-h100.md
@@ -50,16 +50,16 @@ In each container, you should be in the `/sgl-workspace/dynamo/components/backen
4. On the head prefill node, run the helper script provided to generate commands to start the `nats-server`, `etcd`. This script will also tell you which environment variables to export on each node to make deployment easier.

```bash
-./utils/gen_env_vars.sh
+./components/backends/sglang/src/dynamo/sglang/utils/gen_env_vars.sh
```

5. Run the ingress and prefill worker

```bash
# run ingress
-dynamo run in=http out=dyn &
+python3 -m dynamo.frontend --http-port=8000 &
# optionally run the http server that allows you to flush the kv cache for all workers (see benchmarking section below)
-python3 utils/sgl_http_server.py --ns dynamo &
+python3 -m dynamo.sglang.utils.sgl_http_server --ns dynamo &
# run prefill worker
python3 -m dynamo.sglang.worker \
--model-path /model/ \
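Once the updated ingress command above (`python3 -m dynamo.frontend --http-port=8000`) is running, a quick end-to-end check against the OpenAI-compatible endpoint might look like the sketch below. The path, header, and model name mirror the warmup request shown later in this PR; the host, prompt, and token limit are illustrative placeholders, not taken from the docs.

```bash
# Illustrative sanity check against the frontend started above on port 8000;
# adjust the host if you are not querying from the head prefill node.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/DeepSeek-R1",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    "max_tokens": 32
  }'
```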
4 changes: 2 additions & 2 deletions components/backends/sglang/docs/multinode-examples.md
@@ -19,7 +19,7 @@ SGLang allows you to deploy multi-node sized models by adding in the `dist-init-
Node 1: Run HTTP ingress, processor, and 8 shards of the prefill worker
```bash
# run ingress
-dynamo run in=http out=dyn &
+python3 -m dynamo.frontend --http-port=8000 &
# run prefill worker
python3 -m dynamo.sglang.worker \
--model-path /model/ \
@@ -102,7 +102,7 @@ SGLang typically requires a warmup period to ensure the DeepGEMM kernels are loa
curl ${HEAD_PREFILL_NODE_IP}:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
"model": "deepseek-ai/DeepSeek-R1",
"messages": [
{
"role": "user",
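The hunk above references SGLang's `dist-init-addr` mechanism for multi-node sharding. As a rough sketch only (not taken from the file, and assuming `dynamo.sglang.worker` forwards SGLang's standard multi-node server arguments), launching the rank-0 shard of a prefill worker that spans two nodes might look like this:

```bash
# Hypothetical multi-node launch on the head prefill node (node rank 0).
# --dist-init-addr, --nnodes, --node-rank, and --tp-size are standard SGLang
# server arguments; the port, node count, and TP size here are placeholders.
python3 -m dynamo.sglang.worker \
  --model-path /model/ \
  --tp-size 16 \
  --dist-init-addr ${HEAD_PREFILL_NODE_IP}:29500 \
  --nnodes 2 \
  --node-rank 0
```

The remaining nodes would repeat the command with their own `--node-rank`, keeping the same `--dist-init-addr`.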
2 changes: 1 addition & 1 deletion components/backends/sglang/docs/sgl-http-server.md
@@ -74,7 +74,7 @@ The server accepts the following command-line arguments:

Start the server:
```bash
-python src/dynamo/sglang/utils/sgl_http_server.py --port 9001 --namespace dynamo
+python3 -m dynamo.sglang.utils.sgl_http_server --ns dynamo
```

The server will automatically discover all SGLang components in the specified namespace and provide HTTP endpoints for managing them.
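As a usage sketch of that helper server (the endpoint name and default port are assumptions based on the older example above, not confirmed by this diff), flushing the KV cache across all discovered workers could then look like:

```bash
# Assumes the helper server listens on port 9001 (as in the previous example)
# and exposes a flush_cache endpoint; check sgl-http-server.md for the
# authoritative endpoint list and defaults.
curl -X POST http://localhost:9001/flush_cache
```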