Merged
Changes from 19 commits
Commits
40 commits
68714fd
rename
tedzhouhk Sep 19, 2025
1c41d56
stage
tedzhouhk Sep 19, 2025
4a73121
Merge branch 'main' of https://github.com/ai-dynamo/dynamo into hzhou…
tedzhouhk Sep 19, 2025
db737f6
stage
tedzhouhk Sep 19, 2025
2a42aea
add test
tedzhouhk Sep 20, 2025
5b88f60
add test
tedzhouhk Sep 20, 2025
330a611
delete
tedzhouhk Sep 20, 2025
684f7c9
bug fixes
tedzhouhk Sep 22, 2025
a895bc2
enable grove
tedzhouhk Sep 22, 2025
a6f70a5
bug
tedzhouhk Sep 22, 2025
0b64cda
remove isl/osl in planner arg
tedzhouhk Sep 22, 2025
4fb5294
update doc and plot script
tedzhouhk Sep 23, 2025
18c0fae
pc
tedzhouhk Sep 23, 2025
bad3f5c
Merge branch 'main' of https://github.com/ai-dynamo/dynamo into hzhou…
tedzhouhk Sep 23, 2025
76e77aa
pc
tedzhouhk Sep 23, 2025
88105cb
doclink
tedzhouhk Sep 23, 2025
44bc765
Merge branch 'main' of https://github.com/ai-dynamo/dynamo into hzhou…
tedzhouhk Sep 23, 2025
23dd336
fix test
tedzhouhk Sep 23, 2025
96d2978
fix conc bug
tedzhouhk Sep 23, 2025
b016918
fix doc, add todo
tedzhouhk Sep 23, 2025
5440e91
Update benchmarks/profiler/profile_sla.py
tedzhouhk Sep 23, 2025
77d2d66
Update benchmarks/profiler/profile_sla.py
tedzhouhk Sep 23, 2025
fe05ce5
Update benchmarks/profiler/profile_sla.py
tedzhouhk Sep 23, 2025
e9c9c71
Update benchmarks/profiler/profile_sla.py
tedzhouhk Sep 23, 2025
2f61d82
Update benchmarks/profiler/profile_sla.py
tedzhouhk Sep 23, 2025
95c718b
Update benchmarks/profiler/profile_sla.py
tedzhouhk Sep 23, 2025
a1ec4dc
Update benchmarks/profiler/utils/plot.py
tedzhouhk Sep 23, 2025
da99c64
doc
tedzhouhk Sep 23, 2025
56241ae
Merge branch 'hzhou/sglang-dsr1-sweep' of https://github.com/ai-dynam…
tedzhouhk Sep 23, 2025
06d9bd6
Update docs/benchmarks/pre_deployment_profiling.md
tedzhouhk Sep 23, 2025
b562984
pc
tedzhouhk Sep 23, 2025
4f2f905
Update benchmarks/profiler/profile_sla.py
tedzhouhk Sep 23, 2025
fb897c1
Update benchmarks/profiler/utils/plot.py
tedzhouhk Sep 23, 2025
ff72609
better code
tedzhouhk Sep 24, 2025
5f428dd
Merge branch 'hzhou/sglang-dsr1-sweep' of https://github.com/ai-dynam…
tedzhouhk Sep 24, 2025
3acb775
pc
tedzhouhk Sep 24, 2025
fbac34f
add todo
tedzhouhk Sep 24, 2025
aead4d3
mypy
tedzhouhk Sep 24, 2025
82595a9
update test
tedzhouhk Sep 24, 2025
17d3349
pc
tedzhouhk Sep 24, 2025
61 changes: 61 additions & 0 deletions benchmarks/profiler/deploy/profile_sla_moe_job.yaml
@@ -0,0 +1,61 @@
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
apiVersion: batch/v1
kind: Job
metadata:
  name: profile-sla
  namespace: ${NAMESPACE}
spec:
  template:
    spec:
      serviceAccountName: dynamo-sa
      containers:
        - name: profile-sla
          image: ${DOCKER_IMAGE}
          resources:
            requests:
              cpu: "32"
              memory: "50Gi"
          env:
            - name: HUGGING_FACE_HUB_TOKEN
              valueFrom:
                secretKeyRef:
                  name: hf-token-secret
                  key: HF_TOKEN
            - name: NATS_SERVER
              value: nats://${NAMESPACE}-nats:4222
            - name: ETCD_ENDPOINTS
              value: ${NAMESPACE}-etcd:2379
          workingDir: /sgl-workspace/dynamo
          command: ["python", "-m", "benchmarks.profiler.profile_sla"]
          args:
            - --config
            - /sgl-workspace/dynamo/recipes/deepseek-r1/sglang-wideep/tep16p-dep16d-disagg.yaml
            - --output-dir
            - /data/profiling_results
            - --namespace
            - ${NAMESPACE}
            - --backend
            - sglang
            - --is-moe-model
            - --min-num-gpus-per-engine
            - "8"
            - --max-num-gpus-per-engine
            - "16"
            - --isl
            - "3000"
            - --osl
            - "150"
            - --ttft
            - "200"
            - --itl
            - "20"
          volumeMounts:
            - name: output-volume
              mountPath: /data
      restartPolicy: Never
      volumes:
        - name: output-volume
          persistentVolumeClaim:
            claimName: dynamo-pvc
  backoffLimit: 0
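The manifest above leaves `${NAMESPACE}` and `${DOCKER_IMAGE}` as placeholders, so it has to be rendered before it can be applied. The PR does not say which tool does this (a shell utility such as `envsubst` is one common choice), so the following is only a minimal Python sketch of the substitution step. The function name `render_manifest`, the shortened manifest snippet, and the example namespace and image values are all hypothetical.

```python
from string import Template

# Shortened excerpt of the Job manifest; only the placeholder-bearing lines are kept.
MANIFEST_SNIPPET = """\
metadata:
  name: profile-sla
  namespace: ${NAMESPACE}
spec:
  template:
    spec:
      containers:
        - name: profile-sla
          image: ${DOCKER_IMAGE}
          env:
            - name: NATS_SERVER
              value: nats://${NAMESPACE}-nats:4222
"""


def render_manifest(text: str, namespace: str, docker_image: str) -> str:
    """Fill the ${NAMESPACE} and ${DOCKER_IMAGE} placeholders (hypothetical helper).

    Template.substitute raises KeyError if the text contains a placeholder
    that is not supplied, which catches forgotten variables early.
    """
    return Template(text).substitute(NAMESPACE=namespace, DOCKER_IMAGE=docker_image)


# Example values are made up for illustration.
rendered = render_manifest(MANIFEST_SNIPPET, "my-namespace", "example.registry/dynamo:tag")
print(rendered)
```

Using `substitute` rather than `safe_substitute` is a deliberate choice in this sketch: an unfilled placeholder fails loudly instead of reaching the cluster as a literal `${...}` string.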
7 changes: 7 additions & 0 deletions benchmarks/profiler/profile_endpoint.py
@@ -79,6 +79,12 @@
        default=8,
        help="interpolation granularity for the results",
    )
    parser.add_argument(
        "--attention_dp_size",
        type=int,
        default=1,
        help="attention dp size of the endpoint for MoE models",
    )
    args = parser.parse_args()

    os.makedirs(args.work_dir, exist_ok=True)
@@ -105,6 +111,7 @@
            args.max_kv_tokens,
            args.max_context_length,
            args.interpolation_granularity,
            args.attention_dp_size,
        )
    else:
        raise ValueError(f"Invalid mode: {args.mode}")
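To illustrate the effect of the new flag, here is a minimal, standalone sketch of how `--attention_dp_size` parses. It reproduces only the argument added in this diff; the rest of the `profile_endpoint.py` parser is omitted, and the `build_parser` helper and the example value `8` are assumptions made for the illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper holding only the argument added in this change."""
    parser = argparse.ArgumentParser(description="attention_dp_size sketch")
    parser.add_argument(
        "--attention_dp_size",
        type=int,
        default=1,
        help="attention dp size of the endpoint for MoE models",
    )
    return parser


# Without the flag, the default of 1 applies.
default_args = build_parser().parse_args([])

# With the flag, the string from the command line is converted to int.
moe_args = build_parser().parse_args(["--attention_dp_size", "8"])

print(default_args.attention_dp_size, moe_args.attention_dp_size)
```

Because the default is `1`, existing invocations that do not pass the flag keep parsing as before.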