examples/deployments/ECS/README.md
+10 -10 (10 additions & 10 deletions)
@@ -1,12 +1,12 @@
 # Dynamo Deployment of vLLM Example on AWS ECS
-## 1. ECS Cluster Setup
+## 1. ECS Cluster Setup
 1. Go to the AWS ECS console, open the **Clusters** tab, and click **Create cluster** to create a cluster named `dynamo-GPU`.
 2. Enter the cluster name and choose **AWS EC2 instances** as the infrastructure. This option creates a cluster with EC2 instances to deploy containers on.
 3. Choose the Amazon ECS-optimized GPU AMI `Amazon Linux 2 (GPU)`, which includes NVIDIA drivers and the Docker GPU runtime out of the box.
 4. Choose `g6e.2xlarge` as the **EC2 instance type** and add an `SSH Key pair` so you can log in to the instance for debugging.
 5. Set **Root EBS volume size** to `200`.
-6. For networking, use the default settings. Make sure the **security group** has
-   - an inbound rule which allows "All traffic" from this security group.
+6. For networking, use the default settings. Make sure the **security group** has
+   - an inbound rule which allows "All traffic" from this security group.
    - an inbound rule for ports 22 and 8000, so that you can SSH into the instance for debugging.
 7. Select `Turn on` for the **Auto-assign public IP** option.
 8. Click **Create**; the cluster is deployed through CloudFormation.
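
The security-group requirements in step 6 can be written down as data. The following is a minimal sketch, not part of the README: it builds the inbound rules in the `IpPermissions` shape accepted by the EC2 `authorize_security_group_ingress` call. The group ID `sg-0123456789abcdef0` and the source range `203.0.113.0/24` are placeholders, and restricting ports 22 and 8000 to a single CIDR is an assumption, since the README does not say where that traffic should come from.

```python
def build_inbound_rules(security_group_id, admin_cidr):
    """Return the inbound rules from step 6 as EC2 IpPermissions dicts."""
    # Rule 1: allow all traffic between members of this security group.
    self_rule = {
        "IpProtocol": "-1",  # "-1" means all protocols and all ports
        "UserIdGroupPairs": [{"GroupId": security_group_id}],
    }
    # Rules 2 and 3: TCP 22 and 8000 from admin_cidr (an assumed source range).
    port_rules = [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": admin_cidr}],
        }
        for port in (22, 8000)
    ]
    return [self_rule] + port_rules


# Placeholder values; substitute your own group ID and source range.
rules = build_inbound_rules("sg-0123456789abcdef0", "203.0.113.0/24")
```

With boto3 installed and credentials configured, a list like `rules` could be passed as `IpPermissions`, together with `GroupId`, to `authorize_security_group_ingress`.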
@@ -16,7 +16,7 @@ Add a task for ETCD and NATS services. A sample task definition JSON is attached
 1. ETCD container
    - Use `etcd` as the container name
    - Set the image URL to `bitnami/etcd` and select **Yes** for Essential container
-   - Container port
+   - Container port

    |Container port|Protocol|Port name|App protocol|
    |-|-|-|-|
@@ -26,7 +26,7 @@ Add a task for ETCD and NATS services. A sample task definition JSON is attached
 2. NATS container
    - Use `nats` as the container name
    - Set the image URL to `nats` and select **Yes** for Essential container
-   - Container port
+   - Container port

    |Container port|Protocol|Port name|App protocol|
    |-|-|-|-|
@@ -41,10 +41,10 @@ This task will create vLLM frontend, processors, routers and a decode worker.
 Please follow the steps below to create this task
 - Set container name as `dynamo-frontend` and use the prebuilt [Dynamo container](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/vllm-runtime).
 - Choose `Amazon EC2 instances` as the **Launch type** with a **Task size** of `2 vCPU` and `40 GB` memory
-- Choose `host` as the Network mode.
+- Choose `host` as the Network mode.
 - Use `dynamo-vLLM-frontend` as the container name
-- Add your image URL (you can use the prebuilt [Dynamo container](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/vllm-runtime)) and select **Yes** for Essential container. It can be an AWS ECR URL or an NVIDIA NGC URL. If using an NGC URL, also choose **Private registry authentication** and add your Secrets Manager ARN or name.
-- Container port
+- Add your image URL (you can use the prebuilt [Dynamo container](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/containers/vllm-runtime)) and select **Yes** for Essential container. It can be an AWS ECR URL or an NVIDIA NGC URL. If using an NGC URL, also choose **Private registry authentication** and add your Secrets Manager ARN or name.
+- Container port

 |Container port|Protocol|Port name|App protocol|
 |-|-|-|-|
@@ -56,7 +56,7 @@ Please follow steps below to create this task
 |-|-|-|
 |ETCD_ENDPOINTS|Value|http://IP_ADDRESS:2379|
 |NATS_SERVER|Value|nats://IP_ADDRESS:4222|
-- Docker configuration
+- Docker configuration
 Add `sh,-c` in **Entry point** and `cd components/backends/vllm && python -m dynamo.frontend --router-mode kv & python3 -m dynamo.vllm --model Qwen/Qwen3-0.6B --enforce-eager` in **Command**

 2. Dynamo vLLM PrefillWorker Task
@@ -69,7 +69,7 @@ Create the PrefillWorker task same as the frontend worker, except for following
 You can create a service or directly run the task from the task definition
 1. ETCD/NATS Task
    - For **Existing cluster**, choose the Fargate cluster created in the hello world example.
-   - Wait for this deployment to finish, and get the **Private IP** of this task.
+   - Wait for this deployment to finish, and get the **Private IP** of this task.
 2. Dynamo Frontend Task
    - For **Existing cluster**, choose the EC2 cluster created in step 1.
    - In the **Container Overrides**, use the private IP of the ETCD/NATS task for the `ETCD_ENDPOINTS` and `NATS_SERVER` values.
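
The override step can be sketched in Python. This is an illustration under assumptions rather than part of the README: the helper name `build_frontend_overrides` and the example IP `10.0.1.23` are hypothetical, the ports come from the environment table in the frontend task (2379 for ETCD, 4222 for NATS), and the returned dictionary follows the `overrides` shape accepted by the ECS `RunTask` API.

```python
import ipaddress


def build_frontend_overrides(container_name, etcd_nats_private_ip):
    """Build ECS container overrides that point the frontend at ETCD/NATS."""
    # Reject malformed addresses early; raises ValueError on bad input.
    ipaddress.ip_address(etcd_nats_private_ip)
    environment = [
        {"name": "ETCD_ENDPOINTS", "value": f"http://{etcd_nats_private_ip}:2379"},
        {"name": "NATS_SERVER", "value": f"nats://{etcd_nats_private_ip}:4222"},
    ]
    return {
        "containerOverrides": [
            {"name": container_name, "environment": environment}
        ]
    }


# Hypothetical private IP, as read from the running ETCD/NATS task.
overrides = build_frontend_overrides("dynamo-vLLM-frontend", "10.0.1.23")
env = {
    item["name"]: item["value"]
    for item in overrides["containerOverrides"][0]["environment"]
}
```

Keeping the IP as a single argument means both values are always derived from the same task address, which mirrors the manual step of copying one private IP into two fields.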