docs: Bring back some missed release/0.4.0 doc changes, fix broken links, add lychee link checker github action #2482

rmccorm4 · 2025-08-17T02:14:01Z

Overview:

Bringing over various release/0.4.0 documentation changes from @athreesh that weren't added to main:

- Fixes all links (covers "docs: Final fixes to links reported by QA" #2334 and "docs: address sphinx build errors for docs.nvidia.com" #2346)
- Copies over index.rst and hidden_toctree.rst from release/0.4.0 (covers "docs: Clean index.rst" #2104 and "docs: address sphinx build errors for docs.nvidia.com" #2346)
- Renames examples/multimodal_v1 to examples/multimodal since we don't have a v0 anymore (@krishung5 @GuanLuo @indrajit96 FYI)
  - Also FYI, we now have nixl_connect stuff in two places again - let's bring it back to one place: "[BUG]: Remove duplicate nixl_connect implementations" #2481
- Adds a GitHub action for link validation to help automate and maintain correct links (closes "ci: Add lychee link check" #2362)

$ lychee \
    --no-progress \
    --exclude-path "ATTRIBUTIONS.*" \
    --accept "200..=299, 403, 429" \
    --exclude-all-private --exclude 0.0.0.0 \
    .
...
🔍 612 Total (in 21s) ✅ 607 OK 🚫 0 Errors 👻 5 Excluded
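
For readers unfamiliar with the `--accept` value above, here is a minimal sketch of which status codes it lets through. It assumes `..=` denotes an inclusive range (as in Rust range syntax); `parse_accept` and `counts_as_ok` are hypothetical helpers for illustration and are not part of lychee.

```python
def parse_accept(spec: str) -> set[int]:
    """Expand a comma-separated accept spec into a set of status codes."""
    accepted: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "..=" in part:
            # Inclusive range such as "200..=299" (assumed semantics).
            low, high = part.split("..=")
            accepted.update(range(int(low), int(high) + 1))
        else:
            # Single status code such as "403".
            accepted.add(int(part))
    return accepted


ACCEPTED = parse_accept("200..=299, 403, 429")


def counts_as_ok(status: int) -> bool:
    """Return True if a response with this status would be reported as OK."""
    return status in ACCEPTED
```

Under this reading, a 403 (forbidden) or 429 (rate limited) response is not reported as a broken link, while a 404 still is.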

Details:

It wasn't easy to just cherry-pick merged changes because many docs and paths have changed since a few weeks ago, so instead this was a fairly manual effort, and it's likely that some things were missed - @athreesh

Future Follow-up:

- Anything missing can be brought over in "docs: fix index.rst for 0.4.0" #2263 or a follow-up
- Address "[DOCS]: Bring back benchmarking guide" #2031

Summary by CodeRabbit

New Features
- Added automated documentation link checking in CI.
Documentation
- Updated and corrected numerous links and paths across READMEs and docs.
- Revamped docs landing page with Local and Kubernetes deployment guides, expanded Examples, and reorganized references.
- Expanded hidden toctree for broader coverage.
- Refreshed metrics guide with clearer real-worker setup.
- Bumped release version in quickstart to 0.4.0 and updated registry URLs.
- Standardized multimodal example paths and cross-references.
Tests
- Adjusted test paths to new multimodal example directory.
Chores
- Introduced link-check workflow using lychee.

coderabbitai

Actionable comments posted: 7

🔭 Outside diff range comments (4)

README.md (1)

1-1: Convert site-root links to repository-relative paths

Lychee warnings indicate several markdown links starting with /, which won’t resolve correctly on GitHub. Please remove the leading slash in each link. For example, change:

/docs/architecture/disagg_serving.md → docs/architecture/disagg_serving.md

/lib/bindings/python/README.md → lib/bindings/python/README.md

/components/backends/vllm/README.md → components/backends/vllm/README.md

Occurrences found by the scan:

• docs/architecture/README.md
Lines 58–63:
Disaggregated Serving
Conditional Disaggregation
KV-Aware Routing
Load Based Planner
SLA-Based Planner
KVBM

• (Your top-level or component README)
Line 112: See the README.md for details

• (Documentation or examples referencing VLLM backend)
Lines 47, 125, 206, 270:
…like the LLM aggregated serving example.

Please update all these links (and any others found by the script) to remove the leading /, ensuring they point correctly within the repo.
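
A rough sketch of how such a rewrite could be scripted, assuming inline `[text](target)` links only. `relativize_links` is a hypothetical helper, not something in this repo. For the top-level README.md, dropping the leading slash is enough; for files in subdirectories the target also has to be made relative to the file's own directory, which the sketch handles with `posixpath.relpath`.

```python
import posixpath
import re

# Matches the target of an inline markdown link whose target starts with "/".
LINK_TARGET = re.compile(r"(\]\()(/[^)\s]*)(\))")


def relativize_links(markdown: str, source_file: str) -> str:
    """Rewrite site-root link targets relative to source_file.

    source_file is the markdown file's path from the repository root,
    e.g. "README.md" or "docs/architecture/README.md".
    """
    source_dir = posixpath.dirname(source_file) or "."

    def fix(match: re.Match) -> str:
        # Treat the target as a path from the repo root, then make it
        # relative to the directory that contains the source file.
        target = match.group(2).lstrip("/")
        relative = posixpath.relpath(target, start=source_dir)
        return f"{match.group(1)}{relative}{match.group(3)}"

    return LINK_TARGET.sub(fix, markdown)
```

For example, `relativize_links("[x](/lib/bindings/python/README.md)", "docs/architecture/README.md")` yields a `../../lib/bindings/python/README.md` target. Reference-style links and links inside code blocks are not handled here.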
components/backends/trtllm/deploy/README.md (1)
244-244: Rename kv-cache-tranfer.md to kv-cache-transfer.md and update README link
The KV cache guide filename is misspelled (tranfer → transfer), so both the file and its reference in the README must be corrected:

Rename file:
components/backends/trtllm/kv-cache-tranfer.md → components/backends/trtllm/kv-cache-transfer.md

Update link in:
components/backends/trtllm/deploy/README.md

Proposed diff:
diff --git a/components/backends/trtllm/kv-cache-tranfer.md b/components/backends/trtllm/kv-cache-transfer.md
similarity index 100%
rename from components/backends/trtllm/kv-cache-tranfer.md
rename to components/backends/trtllm/kv-cache-transfer.md

diff --git a/components/backends/trtllm/deploy/README.md b/components/backends/trtllm/deploy/README.md
index abcdef1..1234567 100644
--- a/components/backends/trtllm/deploy/README.md
+++ b/components/backends/trtllm/deploy/README.md
@@ -244,7 +244,7 @@
-For detailed configuration instructions, see the [KV cache transfer guide](../kv-cache-tranfer.md).
+For detailed configuration instructions, see the [KV cache transfer guide](../kv-cache-transfer.md).
examples/basics/multinode/README.md (1)
118-121: Fix typo in the networking note

Minor typo: “follwing” → “following”.

Apply this diff:
-# To find your IP address, run the follwing on your infrastructure node:
+# To find your IP address, run the following on your infrastructure node:
docs/guides/dynamo_deploy/quickstart.md (1)
70-77: Clarify NGC/registry wording and fix grammar

The sentence mixes nvcr.io with a link to catalog.ngc.nvidia.com and uses “setup” instead of “set up”.

Apply this diff:
-Our examples use the [`nvcr.io`](https://catalog.ngc.nvidia.com) but you can setup your own values if you use another docker registry.
+Our examples use the NGC container registry (nvcr.io). You can set up your own values if you use a different container registry.
Optionally, you can link “NGC container registry” to https://ngc.nvidia.com for clarity.

🧹 Nitpick comments (17)

README.md (1)
186-186: Avoid bare URL (markdownlint MD034) and improve wording

Wrap the URL in markdown link syntax to satisfy linters and readability.

Apply this diff:
-You can pass any sglang flags directly to this worker, see https://docs.sglang.ai/advanced_features/server_arguments.html . See there to use multiple GPUs.
+You can pass any SGLang flags directly to this worker; see the [server arguments docs](https://docs.sglang.ai/advanced_features/server_arguments.html). See that page for multi‑GPU usage.
examples/basics/multinode/README.md (1)
155-161: Clean up stray “>” characters in the INFO block

There are stray “>” characters within lines that likely came from formatting artifacts and will render oddly.

Apply this diff to remove them:
-> - `CUDA_VISIBLE_DEVICES`: Controls which GPU each worker uses (0 and 1 for different > GPUs)
+> - `CUDA_VISIBLE_DEVICES`: Controls which GPU each worker uses (0 and 1 for different GPUs)
 > - `--page-size 16`: Sets the KV cache block size - must be identical across all workers
- > - `--disaggregation-mode`: Separates prefill (prompt processing) from decode (token > generation)
+> - `--disaggregation-mode`: Separates prefill (prompt processing) from decode (token generation)
- > - `--disaggregation-transfer-backend nixl`: Enables high-speed GPU-to-GPU transfers
+> - `--disaggregation-transfer-backend nixl`: Enables high-speed GPU-to-GPU transfers
- > - `--skip-tokenizer-init`: Avoids duplicate tokenizer loading since the frontend > handles tokenization
+> - `--skip-tokenizer-init`: Avoids duplicate tokenizer loading since the frontend handles tokenization
docs/guides/dynamo_deploy/quickstart.md (1)
85-86: Keep release examples consistent with 0.4.0

The inline example still references 0.3.2; update to 0.4.0 for consistency.

Apply this diff:
-export IMAGE_TAG=RELEASE_VERSION # i.e. 0.3.2 - the release you are using
+export IMAGE_TAG=RELEASE_VERSION # e.g., 0.4.0 - the release you are using
benchmarks/llm/README.md (1)
15-16: Optionally reference a tracking issue

“Coming soon.” is fine to avoid broken links. Consider adding a reference to a tracking issue or milestone so readers know where to follow progress.

For example:
-Coming soon.
+Coming soon. Track progress in issue #2031 (or the relevant tracking issue/milestone).
deploy/metrics/README.md (1)
68-75: Clarify location: docker-compose.yml is not "in this directory".

The bullet now correctly points one level up, but the preceding sentence still says "in this directory," which is misleading.

Apply this diff to clarify:
-### Required Files
-
-The following configuration files should be present in this directory:
+### Required Files
+
+The following configuration files are required (note: docker-compose.yml is located in the parent deploy directory):
 - [docker-compose.yml](../docker-compose.yml): Defines the Prometheus and Grafana services
components/backends/trtllm/README.md (2)
196-196: Grammar fix: "send a request".

Tiny polish for readability.
-See [client](../vllm/README.md#client) section to learn how to send request to the deployment.
+See the [client](../vllm/README.md#client) section to learn how to send a request to the deployment.
236-236: Grammar fix: "send a request".

Same nit in the repeated Client section.
-See [client](../vllm/README.md#client) section to learn how to send request to the deployment.
+See the [client](../vllm/README.md#client) section to learn how to send a request to the deployment.
examples/multimodal/connect/README.md (2)
118-124: Avoid relative GitHub line anchors; they don't work on relative file links.

Consolidating to worker.py is fine.

The link on Line 122 to encode_worker.py includes #L190 which won’t jump to a line in GitHub when used as a relative path. Prefer dropping the #L… anchor or linking to a stable symbol/section.

Apply these edits:
-See [prefill_worker](../components/worker.py) or [decode_worker](../components/worker.py),
+See [prefill/decode logic in worker.py](../components/worker.py),
@@
-See [encode_worker](../components/encode_worker.py#L190),
+See [encode_worker](../components/encode_worker.py),
341-342: Fix list indentation (markdownlint MD007).

Removes the extra indentation on the list item.
-  - [Dynamo Multimodal Example](../../../examples/multimodal)
+- [Dynamo Multimodal Example](../../../examples/multimodal)
components/backends/sglang/deploy/README.md (1)
77-77: Relative path fix looks correct; minor wording nit

The deeper relative path resolves correctly from this directory. Consider tightening the phrasing/punctuation for readability.

Apply this minimal tweak:
-1. **Dynamo Cloud Platform installed** - See [Installing Dynamo Cloud](../../../../docs/guides/dynamo_deploy/dynamo_cloud.md)
+1. **Dynamo Cloud Platform installed** — see [Installing Dynamo Cloud](../../../../docs/guides/dynamo_deploy/dynamo_cloud.md).
components/metrics/README.md (1)
83-98: Clarify worker naming and verify module entrypoint.

Nit: "VllmWorkers" casing is inconsistent. Prefer "vLLM workers" or "VLLM workers" for consistency with the backend docs.

Request: Please verify the module entrypoint is correct: python -m dynamo.vllm --model-path <your-model-checkout>. If the intended entrypoint lives under components/backends, it might be dynamo.backends.vllm or a worker submodule.

If it needs correction, here’s a minimal fix:
-Then, to monitor the metrics of these VllmWorkers, run:
+Then, to monitor the metrics of these vLLM workers, run:
.github/workflows/docs-link-check.yml (2)
34-43: Make link checks faster, more stable, and reproducible.

Add caching and rate-limiting options to reduce flakes.

Consider moving flags to a checked-in lychee.toml for consistency locally and in CI.

Proposed improvements:
-          lychee \
-            --no-progress \
-            --exclude-path "ATTRIBUTIONS.*" \
-            --accept "200..=299, 403, 429" \
-            --exclude-all-private --exclude 0.0.0.0 \
-            .
+          lychee \
+            --no-progress \
+            --cache --max-cache-age 1d \
+            --retry-wait-time 2 \
+            --max-retries 2 \
+            --exclude-path "ATTRIBUTIONS.*" \
+            --accept "200..=299, 403, 429" \
+            --exclude-all-private --exclude 0.0.0.0 \
+            .
Optional: add an upload-artifact step for the lychee report, and/or a scheduled cron run to detect link rot outside PRs.
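
One way to keep local and CI invocations consistent, short of a checked-in lychee.toml, is to assemble the command in a single place. A minimal sketch, using only the flags that appear in the proposal above; `lychee_command` is a hypothetical helper and the flag values are copied as-is, not tuned.

```python
import shlex


def lychee_command(root: str = ".", use_cache: bool = True) -> list[str]:
    """Assemble the lychee argv from the flags shown in the proposal above."""
    argv = ["lychee", "--no-progress"]
    if use_cache:
        argv += ["--cache", "--max-cache-age", "1d"]
    argv += [
        "--retry-wait-time", "2",
        "--max-retries", "2",
        "--exclude-path", "ATTRIBUTIONS.*",
        "--accept", "200..=299, 403, 429",
        "--exclude-all-private",
        "--exclude", "0.0.0.0",
        root,
    ]
    return argv


# A copy-pasteable shell string; pass the list itself to subprocess.run to execute.
COMMAND_LINE = shlex.join(lychee_command())
```

Keeping the arguments as a list avoids shell-quoting mistakes around values such as the `--accept` spec, which contains spaces.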

3-8: Optional: Add a scheduled run for proactive link-rot detection.

Running on PRs and main pushes is great. Adding a weekly cron helps catch link rot between changes.

Example:
 on:
   push:
     branches:
     - main
   pull_request:
+  schedule:
+    - cron: "0 6 * * 1" # weekly, 06:00 UTC Mondays
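
To sanity-check what that cron expression means (minute 0, hour 6, any day of month, any month, day-of-week 1, i.e. Monday, evaluated in UTC), here is a small sketch that computes the next matching time. `next_weekly_run` is a hypothetical helper written only for this one schedule; it is not a general cron parser.

```python
from datetime import datetime, timedelta, timezone


def next_weekly_run(now: datetime) -> datetime:
    """Return the next Monday 06:00 UTC strictly after `now`.

    `now` must be timezone-aware.
    """
    now = now.astimezone(timezone.utc)
    candidate = now.replace(hour=6, minute=0, second=0, microsecond=0)
    # datetime.weekday(): Monday is 0, so move forward to the next Monday.
    candidate += timedelta(days=(0 - candidate.weekday()) % 7)
    if candidate <= now:
        # Already past this Monday's 06:00, so use next week's.
        candidate += timedelta(days=7)
    return candidate
```

For a PR opened on Sunday 2025-08-17 at 02:14 UTC, the first scheduled run under this expression would be Monday 2025-08-18 at 06:00 UTC.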
docs/API/nixl_connect/README.md (4)
106-121: Fix “KV$” typo and minor phrasing clarity in the multimodal flow.

Typo on Line 118: “KV$” likely meant “KV cache” or “KV-cache”.

Apply:
- 6. Prefill Worker receives the embeddings from Encode Worker and generates a key-value cache (KV$) update for Decode Worker's LLM and writes the update directly to the GPU memory reserved for the data.
+ 6. Prefill Worker receives the embeddings from Encode Worker and generates a key-value cache (KV) update for Decode Worker's LLM, writing the update directly to the GPU memory reserved for the data.
156-159: Tighten phrasing; fix article usage.

Minor grammar/readability improvements.
-See [prefill_worker](../../../examples/multimodal/components/worker.py) or [decode_worker](../../../examples/multimodal/components/worker.py) from our Multimodal example,
-for how they coordinate directly with the Encode Worker by creating a [`WritableOperation`](writable_operation.md),
-sending the operation's metadata via Dynamo's round-robin dispatcher, and awaiting the operation for completion before making use of the transferred data.
+See [prefill_worker](../../../examples/multimodal/components/worker.py) and [decode_worker](../../../examples/multimodal/components/worker.py) in our multimodal example
+for how they coordinate with the Encode Worker by creating a [`WritableOperation`](writable_operation.md),
+sending the operation metadata via Dynamo's round-robin dispatcher, and awaiting completion before using the transferred data.
160-164: Fix double slash in link and “awaits for” grammar; consider avoiding line anchors.

Remove the extra slash in the relative link.

Fix “awaits for” to “waits for.”

Optional: linking to specific line numbers (#L190) is brittle; consider a permalink to a commit or omit the anchor.
-See [encode_worker](../../..//examples/multimodal/components/encode_worker.py#L190) from our Multimodal example,
-for how the resulting embeddings are registered with the NIXL subsystem by creating a [`Descriptor`](descriptor.md),
-a [`WriteOperation`](write_operation.md) is created using the metadata provided by the requesting worker,
-and the worker awaits for the data transfer to complete for yielding a response.
+See [encode_worker](../../../examples/multimodal/components/encode_worker.py#L190) in our multimodal example
+for how the resulting embeddings are registered with the NIXL subsystem by creating a [`Descriptor`](descriptor.md),
+creating a [`WriteOperation`](write_operation.md) with metadata provided by the requesting worker,
+and waiting for the data transfer to complete before yielding a response.
If you want a stable anchor, use a commit permalink, e.g.:
https://github.com/ai-dynamo/dynamo/blob/<commit>/examples/multimodal/components/encode_worker.py#L190

181-182: Remove extra slash in relative link.

Minor path cleanup; current link works but is messy.
-  - [Dynamo Multimodal Example](../../..//examples/multimodal)
+  - [Dynamo Multimodal Example](../../../examples/multimodal)
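
The doubled slash is harmless because path normalization collapses empty segments, which is why the current link still resolves. A quick illustration using the standard library (the variable names are only for this example):

```python
import posixpath

messy = "../../..//examples/multimodal"

# normpath drops the empty segment between the two slashes and keeps
# the leading ".." components untouched.
clean = posixpath.normpath(messy)
print(clean)  # ../../../examples/multimodal
```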

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fb10ffb and a8e5396.

📒 Files selected for processing (29)

.github/workflows/docs-link-check.yml (1 hunks)
README.md (1 hunks)
benchmarks/llm/README.md (1 hunks)
components/README.md (1 hunks)
components/backends/sglang/README.md (1 hunks)
components/backends/sglang/deploy/README.md (2 hunks)
components/backends/sglang/docs/dsr1-wideep-h100.md (1 hunks)
components/backends/sglang/slurm_jobs/README.md (2 hunks)
components/backends/trtllm/README.md (2 hunks)
components/backends/trtllm/deploy/README.md (1 hunks)
components/backends/trtllm/gpt-oss.md (1 hunks)
components/backends/vllm/LMCache_Integration.md (1 hunks)
components/metrics/README.md (1 hunks)
deploy/metrics/README.md (1 hunks)
docs/API/nixl_connect/README.md (3 hunks)
docs/architecture/dynamo_flow.md (1 hunks)
docs/guides/deploy/k8s_metrics.md (2 hunks)
docs/guides/dynamo_deploy/README.md (1 hunks)
docs/guides/dynamo_deploy/model_caching_with_fluid.md (1 hunks)
docs/guides/dynamo_deploy/multinode-deployment.md (2 hunks)
docs/guides/dynamo_deploy/quickstart.md (2 hunks)
docs/hidden_toctree.rst (2 hunks)
docs/index.rst (3 hunks)
examples/basics/disaggregated_serving/README.md (1 hunks)
examples/basics/multinode/README.md (2 hunks)
examples/basics/quickstart/README.md (1 hunks)
examples/multimodal/README.md (4 hunks)
examples/multimodal/connect/README.md (2 hunks)
tests/serve/test_vllm.py (2 hunks)

🧰 Additional context used

🧠 Learnings (2)

📚 Learning: 2025-07-02T13:20:28.800Z

Learnt from: fsaady
PR: ai-dynamo/dynamo#1730
File: examples/sglang/slurm_jobs/job_script_template.j2:59-59
Timestamp: 2025-07-02T13:20:28.800Z
Learning: In the SLURM job script template at examples/sglang/slurm_jobs/job_script_template.j2, the `--total_nodes` parameter represents the total nodes per worker type (prefill or decode), not the total nodes in the entire cluster. Each worker type needs to know its own group size for distributed coordination.

Applied to files:

components/backends/sglang/slurm_jobs/README.md

📚 Learning: 2025-07-03T10:14:30.570Z

Learnt from: fsaady
PR: ai-dynamo/dynamo#1730
File: examples/sglang/slurm_jobs/scripts/worker_setup.py:230-244
Timestamp: 2025-07-03T10:14:30.570Z
Learning: In examples/sglang/slurm_jobs/scripts/worker_setup.py, background processes (like nats-server, etcd) are intentionally left running even if later processes fail. This design choice allows users to manually connect to nodes and debug issues without having to restart the entire SLURM job from scratch, providing operational flexibility for troubleshooting in cluster environments.

Applied to files:

components/backends/sglang/slurm_jobs/README.md

🪛 LanguageTool

components/backends/trtllm/README.md

[grammar] ~196-~196: There might be a mistake here.
Context: ...ADME.md#client) section to learn how to send request to the deployment. NOTE: To se...

(QB_NEW_EN)

[grammar] ~236-~236: There might be a mistake here.
Context: ...ADME.md#client) section to learn how to send request to the deployment. NOTE: To se...

(QB_NEW_EN)

components/backends/sglang/deploy/README.md

[grammar] ~77-~77: There might be a mistake here.
Context: ...stalled** - See Installing Dynamo Cloud 2. Kubernetes cluster with GPU support 3....

(QB_NEW_EN)

examples/basics/disaggregated_serving/README.md

[grammar] ~39-~39: There might be a mistake here.
Context: ...s and forwards them to the decode worker - [vLLM Prefill Worker](../../../components...

(QB_NEW_EN)

[grammar] ~40-~40: There might be a mistake here.
Context: ...lized worker for prefill phase execution - [vLLM Decode Worker](../../../components/...

(QB_NEW_EN)

docs/guides/dynamo_deploy/multinode-deployment.md

[grammar] ~53-~53: There might be a mistake here.
Context: ...ple nodes. - LWS: LWS Installation - Volcano: [Volcano Installation](https:...

(QB_NEW_EN)

[grammar] ~113-~113: There might be a mistake here.
Context: ...g**: components/backends/sglang/deploy/ - TensorRT-LLM: [components/backends/trt...

(QB_NEW_EN)

[grammar] ~114-~114: There might be a mistake here.
Context: ...M**: components/backends/trtllm/deploy/ - vLLM: [components/backends/vllm/deploy...

(QB_NEW_EN)

docs/API/nixl_connect/README.md

[grammar] ~157-~157: There might be a mistake here.
Context: ...tableOperation`](writable_operation.md), sending the operation's metadata via Dyn...

(QB_NEW_EN)

🪛 markdownlint-cli2 (0.17.2)

README.md

186-186: Bare URL used

(MD034, no-bare-urls)

examples/multimodal/connect/README.md

341-341: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)

🪛 GitHub Actions: Docs link check

README.md

[warning] 1-1: Lychee: InvalidPathToUri '/components/backends/vllm/README.md'.

[warning] 1-1: Lychee: InvalidPathToUri '/lib/bindings/python/README.md'.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: pre-merge-rust (.)
GitHub Check: Build and Test - dynamo

🔇 Additional comments (25)

examples/basics/disaggregated_serving/README.md (1)

39-41: Normalized README.md link targets look good

Switching component links to README.md is consistent with the repo and should reduce link rot. Relative paths also resolve correctly from this location.

docs/guides/dynamo_deploy/model_caching_with_fluid.md (1)

321-321: LGTM: Updated MinIO docs link

Switch to https://docs.min.io/ is correct and resolves cleanly.

docs/architecture/dynamo_flow.md (1)

20-20: Diagram reference updated to vLLM backend — correct relative path.

The path resolves from docs/architecture to components/backends/vllm as intended.

components/backends/vllm/LMCache_Integration.md (1)

169-169: LMCache Configuration Reference link updated — LGTM.

The new URL reflects the current docs structure.

examples/basics/quickstart/README.md (1)

20-21: Switch to explicit README.md targets — good standardization.

Both links resolve correctly from this location.

components/backends/trtllm/gpt-oss.md (1)

348-348: Relative path correction to multinode guide — looks correct.

From trtllm to repo root then into examples/basics/multinode/README.md is accurate.

examples/basics/multinode/README.md (2)

88-88: Link target correction looks good

The relative path to the SGLang backend README is correct from this location.

213-213: Frontend docs link fix is correct

The updated relative path to the Frontend component README is accurate from this file.

docs/guides/dynamo_deploy/README.md (1)

57-58: VLLM YAML example link fix is correct

The new path resolves correctly to the repository-rooted location for agg.yaml.

components/README.md (1)

80-81: Doc link path correction is correct

Switching to ../docs/ from this file’s location is the correct relative path.

docs/guides/dynamo_deploy/quickstart.md (1)

17-18: Version bump to 0.4.0: confirm chart availability

Looks good, but please verify the 0.4.0 CRDs and Platform Helm charts are published to NGC before merging, so users don’t hit 404s.

You can verify availability manually by visiting the NGC Helm index for ai-dynamo or by attempting to fetch:

https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-crds-0.4.0.tgz

https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-platform-0.4.0.tgz
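
If you prefer to script the check, a minimal sketch follows. The URL pattern is taken from the two links above; `chart_url` and `is_published` are hypothetical helpers, and whether the NGC endpoint answers unauthenticated HEAD requests is an assumption that should be confirmed before relying on the result.

```python
import urllib.request

CHART_BASE = "https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts"


def chart_url(chart: str, version: str) -> str:
    """Build the chart archive URL following the pattern of the links above."""
    return f"{CHART_BASE}/{chart}-{version}.tgz"


def is_published(url: str, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request for url succeeds with a 2xx status."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers urllib.error.URLError / HTTPError (e.g. 404) and timeouts.
        return False
```

Usage would be something like `is_published(chart_url("dynamo-crds", "0.4.0"))` and the same for `dynamo-platform`.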

tests/serve/test_vllm.py (3)

263-275: Path update to examples/multimodal looks good.

This aligns the test with the repo-wide rename from multimodal_v1 to multimodal.

276-290: Keep commented test in sync with path changes.

The commented-out multimodal_disagg case reflects the same path update. Good to keep it consistent for future enablement.

263-275: Launch scripts verified
Both examples/multimodal/launch/agg.sh and examples/multimodal/launch/disagg.sh are present under the expected path, so the VLLMProcess script lookup will succeed. No further changes are required.

examples/multimodal/README.md (2)

63-70: LGTM: multimodal_v1 → multimodal path updates are consistent.

The working directory paths now point to examples/multimodal across all sections.

Also applies to: 141-143, 218-220, 284-292

62-66: All ‘multimodal_v1’ references have been removed
A global search (rg -n "multimodal_v1") returned no matches. No stale links remain. ✅
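
For environments without ripgrep, roughly the same check can be sketched in Python. `find_references` is a hypothetical helper and the suffix list is an assumption about which file types matter. Unlike rg, it does not honor .gitignore, so results may differ in a working tree with build output.

```python
from pathlib import Path


def find_references(root, needle, suffixes=(".md", ".rst", ".py", ".sh", ".yaml", ".yml")):
    """Yield (path, line_number, line) for each line under root containing needle."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        if ".git" in path.parts:
            # Skip repository metadata.
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            # Ignore files that are not valid UTF-8 text.
            continue
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                yield path, number, line.strip()
```

An empty result from `list(find_references(".", "multimodal_v1"))` corresponds to the "no matches" outcome reported above.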

components/backends/sglang/slurm_jobs/README.md (2)

3-3: Link path updates LGTM

The relative links to the H100 WideEP doc are now correct from slurm_jobs/.

Also applies to: 7-7

60-60: Verified “#instructions” anchor exists in dsr1-wideep-h100.md
Confirmed the ## Instructions heading is present, so the deep link will resolve correctly.
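
A sketch of how such an anchor check could be automated. `github_slug` only approximates how GitHub derives heading anchors (lowercase, punctuation dropped, spaces turned into hyphens); it ignores the numeric suffixes added for duplicate headings and does not skip headings inside code fences. Both function names are hypothetical.

```python
import re


def github_slug(heading: str) -> str:
    """Approximate the anchor GitHub generates for a markdown heading."""
    slug = heading.strip().lower()
    slug = re.sub(r"[^\w\- ]", "", slug)  # drop punctuation
    return slug.replace(" ", "-")


def anchors_in(markdown: str) -> set[str]:
    """Collect slugs for every ATX heading (lines starting with 1-6 '#')."""
    anchors = set()
    for line in markdown.splitlines():
        match = re.match(r"^#{1,6}\s+(.*?)\s*#*\s*$", line)
        if match:
            anchors.add(github_slug(match.group(1)))
    return anchors
```

With this, `"instructions" in anchors_in(text_of_dsr1_wideep_h100_md)` would mirror the manual confirmation above.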
docs/hidden_toctree.rst (2)

1-1: Good use of :orphan: to suppress toctree warnings

This prevents Sphinx build noise for the hidden collection.

26-58: Sanity-check all new hidden toctree targets exist

The previous script failed due to unsupported process-substitution. Please replace it with the following here-document loop and rerun from the repo root to catch any missing files:
#!/usr/bin/env bash
set -euo pipefail

docs_root="docs"
missing=0

# List every toctree target, one per line
while read -r p; do
  f="${docs_root}/${p}"
  if [[ ! -f "$f" ]]; then
    echo "Missing: $f" >&2
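    missing=$((missing + 1))  # ((missing++)) returns status 1 when missing is 0, which aborts the script under set -e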
  fi
done <<'EOF'
runtime/README.md
API/nixl_connect/connector.md
API/nixl_connect/descriptor.md
API/nixl_connect/device.md
API/nixl_connect/device_kind.md
API/nixl_connect/operation_status.md
API/nixl_connect/rdma_metadata.md
API/nixl_connect/readable_operation.md
API/nixl_connect/writable_operation.md
API/nixl_connect/read_operation.md
API/nixl_connect/write_operation.md
components/backends/sglang/deploy/README.md
components/backends/sglang/docs/dsr1-wideep-h100.md
components/backends/sglang/docs/multinode-examples.md
components/backends/sglang/docs/sgl-http-server.md
components/backends/sglang/slurm_jobs/README.md
components/router/README.md
examples/README.md
guides/dynamo_deploy/create_deployment.md
guides/dynamo_deploy/sla_planner_deployment.md
guides/dynamo_deploy/helm_install.md
guides/dynamo_deploy/gke_setup.md
guides/dynamo_deploy/README.md
guides/dynamo_run.md
components/backends/vllm/README.md
components/backends/trtllm/README.md
components/backends/trtllm/deploy/README.md
components/backends/trtllm/llama4_plus_eagle.md
components/backends/trtllm/multinode-examples.md
components/backends/trtllm/kv-cache-transfer.md
components/backends/vllm/deploy/README.md
components/backends/vllm/multi-node.md
EOF

if (( missing > 0 )); then
  echo "Found $missing missing toctree targets." >&2
  exit 2
fi

echo "OK: all toctree targets exist."
Run this before merging to avoid subtle broken links in your Sphinx build.
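
The same existence check can be sketched in Python, which avoids shell portability issues altogether. `missing_targets` is a hypothetical helper; the target list would be the same one given in the here-document above.

```python
from pathlib import Path


def missing_targets(docs_root: str, targets: list[str]) -> list[str]:
    """Return the targets that do not exist as files under docs_root."""
    root = Path(docs_root)
    # is_file() follows symlinks, so symlinked docs count as present.
    return [target for target in targets if not (root / target).is_file()]
```

For example, `missing_targets("docs", ["runtime/README.md", "guides/dynamo_run.md"])` returns an empty list when both files exist, and a CI step could fail whenever the returned list is non-empty.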
docs/guides/dynamo_deploy/multinode-deployment.md (2)

53-55: External link updates look right

Moving LWS to kubernetes-sigs and Volcano to volcano.sh is accurate.

113-118: All referenced README.md files exist
The relative paths in docs/guides/dynamo_deploy/multinode-deployment.md correctly point to existing README.md files:

components/backends/sglang/deploy/README.md — OK

components/backends/trtllm/deploy/README.md — OK

components/backends/vllm/deploy/README.md — OK

LGTM!

docs/index.rst (3)

36-79: Great local quickstart; verify sglang worker entrypoint is correct.

The end-to-end local flow looks solid. Please confirm python -m dynamo.sglang.worker is the intended entrypoint exposed by the [sglang] extra so users don’t hit ModuleNotFoundError.
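
One quick way to confirm an entrypoint before documenting it is to ask the import system whether the module can be located. A minimal sketch; `module_available` is a hypothetical helper, and note that resolving a dotted name imports its parent packages as a side effect.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the import system can locate the named module."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package raises ModuleNotFoundError (an ImportError).
        return False
```

Running `module_available("dynamo.sglang.worker")` in an environment with the [sglang] extra installed would answer the question above. A True result only shows the module can be found; if the name is a package, `python -m` additionally needs a `__main__` module.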

153-154: Good consolidation of Helm docs into guides; check incoming links.

Renaming/moving “Manual Helm Deployment” to guides/dynamo_deploy/helm_install.md looks fine. Ensure any inbound links (in README/deploy docs) were updated accordingly.

174-175: Nice addition: NIXL Connect API now discoverable from Reference.

This improves navigation and aligns with the NIXL Connect README updates.

rmccorm4 · 2025-08-18T17:21:18Z

The vllm failures in the GitLab pipeline are unrelated to these changes; they are also being seen on main, from some unpinned versions installed around vllm/pytorch/etc. Some investigation is going on here: #2489

[BASH] RuntimeError: operator _C::aqlm_gemm does not exist

The trtllm failures are also unrelated: a pre-existing failure with startup checks.

E       RuntimeError: FAILED: Check URL: http://localhost:8000/v1/models

rmccorm4 · 2025-08-18T17:28:33Z

If anyone sees issues or failures with link checker related to rate limiting or things unrelated to actually invalid links - please let me know. We can consider making this check less frequent, only on docs changes, nightly, non-required to pass, etc. - lots of options based on what we encounter.

nealvaidya · 2025-08-18T17:32:36Z

If anyone sees issues or failures with link checker related to rate limiting or things unrelated to actually invalid links - please let me know. We can consider making this check less frequent, only on docs changes, nightly, non-required to pass, etc. - lots of options based on what we encounter.

Our sphinx docs build already checks internal docs links (see https://github.com/ai-dynamo/dynamo/actions/runs/17016141856/job/48238767603#step:4:944), but we don't have it fail when it detects dead links, so it's very easy to ignore. We may want to revisit that (or revisit the whole sphinx strategy).

rmccorm4 · 2025-08-18T17:42:03Z

Agree sphinx / docs build needs some love. I'm fairly out of the loop on it, so if you have a list of big issues with it, or ideas for improvements - please share @nealvaidya

rmccorm4 added 10 commits August 16, 2025 12:03

cp(#2351): Move backend READMEs to docs folder and fix relative path …

c83ca40

…to docs/ folder

cp(#2346): Move hello_world example README to docs, swap symlinks, fi…

5bb296a

…x relative paths to docs

docs: Copy over index.rst and hidden_toctree.rst from v0.4.0 to main

ce14f57

Revert "cp(#2351): Move backend READMEs to docs folder and fix relati…

5a8d52e

…ve path to docs/ folder" This reverts commit c83ca40.

Revert "cp(#2346): Move hello_world example README to docs, swap syml…

016d2c1

…inks, fix relative paths to docs" This reverts commit 5bb296a.

Rename multimodal_v1 to multimodal, and fix sglang link

915847c

Bring back missing benchmark README

201c7f9

Fix all broken links caught by lychee

241c614

Add github action for link validation (lychee)

eac4cc5

Update RELEASE_VERSION to 0.4.0 in dynamo_deploy quickstart

fc63b57

copy-pr-bot bot temporarily deployed to GITLAB August 17, 2025 02:14 Inactive

pull-request-size bot added the size/XL label Aug 17, 2025

github-actions bot added the docs label Aug 17, 2025

copy-pr-bot bot temporarily deployed to GITLAB August 17, 2025 02:17 Inactive

Merge branch 'main' into rmccormick/cp_anish_docs_to_main

ced1a73

copy-pr-bot bot temporarily deployed to GITLAB August 17, 2025 02:36 Inactive

copy-pr-bot bot temporarily deployed to GITLAB August 17, 2025 02:37 Inactive

rmccorm4 marked this pull request as ready for review August 17, 2025 02:38

rmccorm4 requested review from a team, biswapanda, hhzhang16, hutm, indrajit96, ishandhanani, julienmancuso, krishung5, nealvaidya, nnshah1 and whoisj as code owners August 17, 2025 02:38

copy-pr-bot bot temporarily deployed to GITLAB August 17, 2025 02:48 Inactive

coderabbitai bot reviewed Aug 17, 2025

Address CodeRabbit feedback

c3ec608

copy-pr-bot bot temporarily deployed to GITLAB August 17, 2025 03:00 Inactive

copy-pr-bot bot temporarily deployed to GITLAB August 17, 2025 03:01 Inactive

Address CodeRabbit feedback - add TODO in workflow for lychee install

4381aa8

copy-pr-bot bot temporarily deployed to GITLAB August 17, 2025 03:01 Inactive

copy-pr-bot bot temporarily deployed to GITLAB August 17, 2025 03:06 Inactive

Try installing ca-certs for cert errors

ef0b231

copy-pr-bot bot temporarily deployed to GITLAB August 17, 2025 03:07 Inactive

copy-pr-bot bot temporarily deployed to GITLAB August 17, 2025 03:11 Inactive

Set GITHUB_TOKEN to avoid github rate limits on URL checks

fdfd807

copy-pr-bot bot temporarily deployed to GITLAB August 17, 2025 03:18 Inactive

rmccorm4 added 2 commits August 16, 2025 20:22

Add lychee result caching

6b7c690

Add lychee result caching docs reference

3e60e42

copy-pr-bot bot temporarily deployed to GITLAB August 17, 2025 03:23 Inactive

copy-pr-bot bot temporarily deployed to GITLAB August 17, 2025 03:24 Inactive

ishandhanani approved these changes Aug 18, 2025

krishung5 approved these changes Aug 18, 2025

nv-anants approved these changes Aug 18, 2025

rmccorm4 merged commit 844f881 into main Aug 18, 2025
19 of 22 checks passed

rmccorm4 deleted the rmccormick/cp_anish_docs_to_main branch August 18, 2025 17:26

hhzhang16 pushed a commit that referenced this pull request Aug 27, 2025

docs: Bring back some missed release/0.4.0 doc changes, fix broken li…

019ccaa

…nks, add lychee link checker github action (#2482) Signed-off-by: Hannah Zhang <[email protected]>

coderabbitai bot mentioned this pull request Oct 6, 2025

docs: move all md files from components to docs #3440

Merged


6 participants
