Merged
16 changes: 2 additions & 14 deletions docs/index.rst
@@ -62,19 +62,6 @@ Dive in: Examples

 Presents TensorRT-LLM examples and reference implementations for deploying Large Language Models (LLMs) in various configurations.
 
-Overview
---------
-
-Dynamo is inference engine agnostic, supporting TRT-LLM, vLLM, SGLang, and others, and captures LLM-specific capabilities such as:
-
-* **Disaggregated prefill & decode inference** - Maximizes GPU throughput and facilitates trade off between throughput and latency.
-* **Dynamic GPU scheduling** - Optimizes performance based on fluctuating demand.
-* **LLM-aware request routing** - Eliminates unnecessary KV cache re-computation.
-* **Accelerated data transfer** - Reduces inference response time using NIXL.
-* **KV cache offloading** - Leverages several memory hierarchies for higher system throughput.
-
-Built in Rust for performance and in Python for extensibility, Dynamo is fully open-source
-and is driven by a transparent development approach. Check out our repo at https://github.com/ai-dynamo/.
 
 .. toctree::
    :hidden:
@@ -120,14 +107,15 @@ and is driven by a transparent development approach. Check out our repo at https
    Dynamo Cloud Kubernetes Platform <guides/dynamo_deploy/dynamo_cloud.md>
    Deploying Dynamo Inference Graphs to Kubernetes using the Dynamo Cloud Platform <guides/dynamo_deploy/operator_deployment.md>
    Manual Helm Deployment <guides/dynamo_deploy/manual_helm_deployment.md>
    GKE Setup Guide <guides/dynamo_deploy/gke_setup.md>
    Minikube Setup Guide <guides/dynamo_deploy/minikube.md>
    Model Caching with Fluid <guides/dynamo_deploy/model_caching_with_fluid.md>
 
 .. toctree::
    :hidden:
    :caption: Benchmarking
 
-   Planner Benchmark Example <guides/planner_benchmark/benchmark_planner.md>
+   Planner Benchmark Example <guides/planner_benchmark/README.md>
 
 
 .. toctree::