Merged
--wip--
biswapanda committed Jul 28, 2025
commit 199e0e79df8ab2adb9a5a0c902708d7d672864dd
2 changes: 1 addition & 1 deletion docs/architecture/distributed_runtime.md
@@ -34,7 +34,7 @@ For example, the deployment configuration `examples/llm/configs/disagg.yaml` hav
- `Processor`: When a new request arrives, `Processor` applies the chat template and perform the tokenization. Then, it route the request to the `VllmWorker`.
- `VllmWorker` and `PrefillWorker`: Perform the actual decode and prefill computation.

-Since the four workers are deployed in different processes, each of them have their own `DistributedRuntime`. Within their own `DistributedRuntime`, they all have their own `Namespace`s named `dynamo`. Then, under their own `dynamo` namespace, they have their own `Component`s named `Frontend/Processor/VllmWorker/PrefillWorker`. Lastly, for the `Endpoint`, `Frontend` has no `Endpoints`, `Processor` and `VllmWorker` each has a `generate` endpoint, and `PrefillWorker` has a placeholder `mock` endpoint. Their `DistributedRuntime`s and `Namespace`s are set in the `@service` decorators in `examples/llm/components/<frontend/processor/worker/prefill_worker>.py`. Their `Component`s are set by their name in `/deploy/dynamo/sdk/src/dynamo/sdk/cli/serve_dynamo.py`. Their `Endpoint`s are set by the `@endpoint` decorators in `examples/llm/components/<frontend/processor/worker/prefill_worker>.py`.
+Since the four workers are deployed in different processes, each of them have their own `DistributedRuntime`. Within their own `DistributedRuntime`, they all have their own `Namespace`s named `dynamo`. Then, under their own `dynamo` namespace, they have their own `Component`s named `Frontend/Processor/VllmWorker/PrefillWorker`. Lastly, for the `Endpoint`, `Frontend` has no `Endpoints`, `Processor` and `VllmWorker` each has a `generate` endpoint, and `PrefillWorker` has a placeholder `mock` endpoint.

## Initialization

4 changes: 0 additions & 4 deletions docs/dynamo_glossary.md
@@ -24,8 +24,6 @@
**Dynamo Cloud** - A Kubernetes platform providing managed deployment experience for Dynamo inference graphs.

## E
-**@endpoint** - A Python decorator used to define service endpoints within a Dynamo component.
-
**Endpoint** - A specific network-accessible API within a Dynamo component, such as `generate` or `load_metrics`.

## F
@@ -70,8 +68,6 @@
**RDMA (Remote Direct Memory Access)** - Technology that allows direct memory access between distributed systems, used for efficient KV cache transfers.

## S
-**@service** - Python decorator used to define a Dynamo service class.
-
**SGLang** - Fast LLM inference framework with native embedding support and RadixAttention.

## T