6 changes: 3 additions & 3 deletions components/backends/sglang/docs/dsr1-wideep-h100.md
@@ -50,16 +50,16 @@ In each container, you should be in the `/sgl-workspace/dynamo/components/backen
4. On the head prefill node, run the helper script provided to generate commands to start the `nats-server`, `etcd`. This script will also tell you which environment variables to export on each node to make deployment easier.

```bash
-./utils/gen_env_vars.sh
+./components/backends/sglang/src/dynamo/sglang/utils/gen_env_vars.sh
```

5. Run the ingress and prefill worker

```bash
# run ingress
-dynamo run in=http out=dyn &
+python3 -m dynamo.frontend --http-port=8000 &
# optionally run the http server that allows you to flush the kv cache for all workers (see benchmarking section below)
-python3 utils/sgl_http_server.py --ns dynamo &
+python3 -m dynamo.sglang.utils.sgl_http_server --ns dynamo &
# run prefill worker
python3 -m dynamo.sglang.worker \
--model-path /model/ \
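Once the updated ingress command above (`python3 -m dynamo.frontend --http-port=8000`) is running, a quick end-to-end check against the OpenAI-compatible endpoint might look like the sketch below. The path, header, and model name mirror the warmup request shown later in this PR; the host, prompt, and token limit are illustrative placeholders, not taken from the docs.

```bash
# Illustrative sanity check against the frontend started above on port 8000;
# adjust the host if you are not querying from the head prefill node.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/DeepSeek-R1",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    "max_tokens": 32
  }'
```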
4 changes: 2 additions & 2 deletions components/backends/sglang/docs/multinode-examples.md
@@ -19,7 +19,7 @@ SGLang allows you to deploy multi-node sized models by adding in the `dist-init-
Node 1: Run HTTP ingress, processor, and 8 shards of the prefill worker
```bash
# run ingress
-dynamo run in=http out=dyn &
+python3 -m dynamo.frontend --http-port=8000 &
# run prefill worker
python3 -m dynamo.sglang.worker \
--model-path /model/ \
@@ -102,7 +102,7 @@ SGLang typically requires a warmup period to ensure the DeepGEMM kernels are loa
curl ${HEAD_PREFILL_NODE_IP}:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
"model": "deepseek-ai/DeepSeek-R1",
"messages": [
{
"role": "user",
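The hunk above references SGLang's `dist-init-addr` mechanism for multi-node sharding. As a rough sketch only (not taken from the file, and assuming `dynamo.sglang.worker` forwards SGLang's standard multi-node server arguments), launching the rank-0 shard of a prefill worker that spans two nodes might look like this:

```bash
# Hypothetical multi-node launch on the head prefill node (node rank 0).
# --dist-init-addr, --nnodes, --node-rank, and --tp-size are standard SGLang
# server arguments; the port, node count, and TP size here are placeholders.
python3 -m dynamo.sglang.worker \
  --model-path /model/ \
  --tp-size 16 \
  --dist-init-addr ${HEAD_PREFILL_NODE_IP}:29500 \
  --nnodes 2 \
  --node-rank 0
```

The remaining nodes would repeat the command with their own `--node-rank`, keeping the same `--dist-init-addr`.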
2 changes: 1 addition & 1 deletion components/backends/sglang/docs/sgl-http-server.md
@@ -74,7 +74,7 @@ The server accepts the following command-line arguments:

Start the server:
```bash
-python src/dynamo/sglang/utils/sgl_http_server.py --port 9001 --namespace dynamo
+python3 -m dynamo.sglang.utils.sgl_http_server --ns dynamo
```

The server will automatically discover all SGLang components in the specified namespace and provide HTTP endpoints for managing them.
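As a usage sketch of that helper server (the endpoint name and default port are assumptions based on the older example above, not confirmed by this diff), flushing the KV cache across all discovered workers could then look like:

```bash
# Assumes the helper server listens on port 9001 (as in the previous example)
# and exposes a flush_cache endpoint; check sgl-http-server.md for the
# authoritative endpoint list and defaults.
curl -X POST http://localhost:9001/flush_cache
```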