Merged
16 changes: 2 additions & 14 deletions docs/index.rst
@@ -62,19 +62,6 @@ Dive in: Examples

 Presents TensorRT-LLM examples and reference implementations for deploying Large Language Models (LLMs) in various configurations.
 
-Overview
---------
-
-Dynamo is inference engine agnostic, supporting TRT-LLM, vLLM, SGLang, and others, and captures LLM-specific capabilities such as:
-
-* **Disaggregated prefill & decode inference** - Maximizes GPU throughput and facilitates trade off between throughput and latency.
-* **Dynamic GPU scheduling** - Optimizes performance based on fluctuating demand.
-* **LLM-aware request routing** - Eliminates unnecessary KV cache re-computation.
-* **Accelerated data transfer** - Reduces inference response time using NIXL.
-* **KV cache offloading** - Leverages several memory hierarchies for higher system throughput.
-
-Built in Rust for performance and in Python for extensibility, Dynamo is fully open-source
-and is driven by a transparent development approach. Check out our repo at https://github.com/ai-dynamo/.
 
 .. toctree::
    :hidden:
@@ -120,14 +107,15 @@ and is driven by a transparent development approach. Check out our repo at https
    Dynamo Cloud Kubernetes Platform <guides/dynamo_deploy/dynamo_cloud.md>
    Deploying Dynamo Inference Graphs to Kubernetes using the Dynamo Cloud Platform <guides/dynamo_deploy/operator_deployment.md>
    Manual Helm Deployment <guides/dynamo_deploy/manual_helm_deployment.md>
    GKE Setup Guide <guides/dynamo_deploy/gke_setup.md>
    Minikube Setup Guide <guides/dynamo_deploy/minikube.md>
    Model Caching with Fluid <guides/dynamo_deploy/model_caching_with_fluid.md>
 
 .. toctree::
    :hidden:
    :caption: Benchmarking
 
-   Planner Benchmark Example <guides/planner_benchmark/benchmark_planner.md>
+   Planner Benchmark Example <guides/planner_benchmark/README.md>
 
 
 .. toctree::