-
Notifications
You must be signed in to change notification settings - Fork 753
feat: decouple dynamo k8s setup from additional benchmarking requirements #2973
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: decouple dynamo k8s setup from additional benchmarking requirements #2973
Conversation
…s script that only applies manifests and adds needed HF tokens Signed-off-by: Hannah Zhang <[email protected]>
Signed-off-by: Hannah Zhang <[email protected]>
…-profilerplanner-related-k8s-deployment-setup-should-be-on
WalkthroughReplaces the old Kubernetes namespace setup script with a new benchmarking resource setup script, updates related documentation and help text to reference the new workflow and the main platform installation guide, and adjusts benchmarking/profiling docs accordingly. No runtime logic changes to benchmarking CLI; one new script added, one script removed. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor User
participant Script as setup_benchmarking_resources.sh
participant Kubectl as kubectl
participant K8s as Kubernetes API
User->>Script: Run with NAMESPACE [+ HF_TOKEN]
Script->>Kubectl: Check kubectl availability
Script->>Kubectl: Verify namespace exists
Kubectl-->>Script: Namespace status
Script->>K8s: Apply manifests (SA/Role/RoleBinding/PVC) via envsubst
alt HF_TOKEN provided
Script->>K8s: Create/Update Secret hf-token-secret
end
opt requirements.txt present
Script->>Script: Install Python deps (pip/pip3 or fallback)
end
Script->>K8s: Verify SA, PVC (and Secret if set)
K8s-->>Script: Resource statuses
Script-->>User: Success / error summary
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Pre-merge checks (1 passed, 2 warnings)❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Poem
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
docs/benchmarks/pre_deployment_profiling.md (2)
141-144: Path mismatch: use /data path consistently with injector/PVC mount.You inject configs under /data, but set DGD_CONFIG_FILE to /workspace/...; that will break the job if it expects the PVC path.
- export DGD_CONFIG_FILE=/workspace/profiling_results/disagg.yaml # or your custom path + export DGD_CONFIG_FILE=/data/configs/disagg.yaml # or your injected PVC path under /data
155-156: Undefined variable DYNAMO_HOME in path.Use a repo‑relative path to avoid confusion.
-Edit `$DYNAMO_HOME/benchmarks/profiler/deploy/profile_sla_job.yaml` to set the target ISL, OSL, TTFT, and ITL. +Edit `benchmarks/profiler/deploy/profile_sla_job.yaml` to set the target ISL, OSL, TTFT, and ITL.
🧹 Nitpick comments (7)
benchmarks/benchmark.sh (1)
101-103: Help text update looks good; consider adding the full script path.To reduce ambiguity for new users, reference the full path: deploy/utils/setup_benchmarking_resources.sh.
- - For Dynamo deployment setup, follow the main installation guide at deploy/README.md - to install the platform, then use setup_benchmarking_resources.sh for benchmarking resources. + - For Dynamo deployment setup, follow the main installation guide at deploy/README.md + to install the platform, then use deploy/utils/setup_benchmarking_resources.sh for benchmarking resources.docs/benchmarks/pre_deployment_profiling.md (1)
190-191: Incorrect reference to when the PVC is created.PVC is created by the setup script, not “Step 2”.
-After the profiling job completes successfully, the results are stored in the persistent volume claim (PVC) created during Step 2. +After the profiling job completes successfully, the results are stored in the persistent volume claim (PVC) created by the setup script.deploy/utils/setup_benchmarking_resources.sh (3)
10-11: REPO_ROOT is defined but unused.Remove it to silence SC2034 and avoid confusion.
-REPO_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
47-47: Add a help/usage switch.Since usage() exists, wire -h/--help.
+if [[ "${1:-}" == "-h" || "${1:-}" == "--help" ]]; then + usage + exit 0 +fi
105-111: Verify all created RBAC objects too.Optional but helpful for debugging.
kubectl get serviceaccount dynamo-sa -n "$NAMESPACE" >/dev/null && ok "ServiceAccount dynamo-sa exists" || err "ServiceAccount dynamo-sa not found" kubectl get pvc dynamo-pvc -n "$NAMESPACE" >/dev/null && ok "PVC dynamo-pvc exists" || err "PVC dynamo-pvc not found" +kubectl get role dynamo-role -n "$NAMESPACE" >/dev/null && ok "Role dynamo-role exists" || err "Role dynamo-role not found" +kubectl get rolebinding dynamo-binding -n "$NAMESPACE" >/dev/null && ok "RoleBinding dynamo-binding exists" || err "RoleBinding dynamo-binding not found"deploy/utils/README.md (2)
51-58: Include RoleBinding in the verification step.Completes the checklist.
kubectl get serviceaccount dynamo-sa -n $NAMESPACE kubectl get pvc dynamo-pvc -n $NAMESPACE +kubectl get rolebinding dynamo-binding -n $NAMESPACE
33-42: Document non-interactive mode for setup script.Reflect FORCE/YES option to avoid prompts in CI.
-After setting up Dynamo Cloud, use this script to prepare your namespace with the additional resources needed for benchmarking and profiling workflows: +After setting up Dynamo Cloud, use this script to prepare your namespace with the additional resources needed for benchmarking and profiling workflows. In CI/non-interactive environments, set FORCE=1 to skip prompts:-deploy/utils/setup_benchmarking_resources.sh +FORCE=1 deploy/utils/setup_benchmarking_resources.sh
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
benchmarks/benchmark.sh(1 hunks)deploy/utils/README.md(2 hunks)deploy/utils/setup_benchmarking_resources.sh(1 hunks)deploy/utils/setup_k8s_namespace.sh(0 hunks)docs/benchmarks/benchmarking.md(1 hunks)docs/benchmarks/pre_deployment_profiling.md(1 hunks)
💤 Files with no reviewable changes (1)
- deploy/utils/setup_k8s_namespace.sh
🧰 Additional context used
🪛 Shellcheck (0.10.0)
deploy/utils/setup_benchmarking_resources.sh
[warning] 10-10: REPO_ROOT appears unused. Verify use (or export if used externally).
(SC2034)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Build and Test - sglang
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (2)
docs/benchmarks/pre_deployment_profiling.md (1)
92-92: Good reframe of the setup step.Reads clearly and matches the new workflow.
docs/benchmarks/benchmarking.md (1)
40-41: Prerequisites update is clear and consistent with the new flow.No issues found.
Signed-off-by: Hannah Zhang <[email protected]>
Signed-off-by: Hannah Zhang <[email protected]>
…deployment-setup-should-be-on
…ents (#2973) Signed-off-by: Hannah Zhang <[email protected]> Signed-off-by: zhongdaor <[email protected]>
…ents (#2973) Signed-off-by: Hannah Zhang <[email protected]> Signed-off-by: Biswa Panda <[email protected]>
…ents (#2973) Signed-off-by: Hannah Zhang <[email protected]> Signed-off-by: Kristen Kelleher <[email protected]>
Overview:
Remove the Dynamo k8s setup from the script to install additional resources needed for k8s benchmarking.
Details:
Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit
New Features
Documentation
Chores