ai-dynamo · rmccorm4 · Aug 20, 2025
diff --git a/components/backends/trtllm/gpt-oss.md b/components/backends/trtllm/gpt-oss.md
@@ -211,7 +211,27 @@ CUDA_VISIBLE_DEVICES=4,5,6,7 python3 -m dynamo.trtllm \
   --expert-parallel-size 4
 ```
 
-### 6. Test the Deployment
+### 6. Verify the Deployment is Ready
+
+Poll the `/health` endpoint to verify that both the prefill and decode worker endpoints have started:
+```bash
+curl http://localhost:8000/health
+```
+
+Before sending an inference request, confirm that the response lists both endpoints:
+```json
+{
+  "endpoints": [
+    "dyn://dynamo.tensorrt_llm.generate",
+    "dyn://dynamo.tensorrt_llm_next.generate"
+  ],
+  "status": "healthy"
+}
+```
+
+If only one worker endpoint is listed, the other may still be starting up. Monitor the worker logs to track startup progress.
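+
+To avoid re-running the command by hand, the check can be scripted. The following is a minimal sketch and not part of Dynamo: `endpoints_ready` and `wait_for_workers` are hypothetical helper names, the 600-second timeout and 5-second interval are arbitrary assumptions, and the endpoint names are the ones shown in the response above.
+
+```bash
+# Assumed default; override HEALTH_URL if the frontend listens elsewhere.
+HEALTH_URL="${HEALTH_URL:-http://localhost:8000/health}"
+
+# Succeeds only if the JSON on stdin lists both worker endpoints.
+endpoints_ready() {
+  local body
+  body="$(cat)"
+  grep -qF '"dyn://dynamo.tensorrt_llm.generate"' <<<"$body" &&
+    grep -qF '"dyn://dynamo.tensorrt_llm_next.generate"' <<<"$body"
+}
+
+# Polls /health until both endpoints appear or the timeout (seconds) expires.
+wait_for_workers() {
+  local timeout="${1:-600}" interval="${2:-5}" elapsed=0
+  until curl -sf "$HEALTH_URL" | endpoints_ready; do
+    if (( elapsed >= timeout )); then
+      echo "Timed out waiting for both worker endpoints" >&2
+      return 1
+    fi
+    sleep "$interval"
+    elapsed=$(( elapsed + interval ))
+  done
+  echo "Both worker endpoints are registered"
+}
+```
+
+After sourcing the snippet, `wait_for_workers` blocks with the default limits, and `wait_for_workers 120 2` uses a shorter timeout and interval. A failed or empty `curl` response is treated the same as a missing endpoint, so the loop keeps polling while the frontend itself is still starting.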
+
+### 7. Test the Deployment
 
 Send a test request to verify the deployment: