-
Notifications
You must be signed in to change notification settings - Fork 753
feat: allow in-cluster perf benchmarks with a kubectl one-liner #3144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: allow in-cluster perf benchmarks with a kubectl one-liner #3144
Conversation
Signed-off-by: Hannah Zhang <[email protected]>
…-allow-in-cluster-perf-benchmarks-kubectl-one-liner Signed-off-by: Hannah Zhang <[email protected]>
…st main Signed-off-by: Hannah Zhang <[email protected]>
Signed-off-by: Hannah Zhang <[email protected]>
…ser and reasoning parser (#2999) Signed-off-by: zhongdaor <[email protected]>
…3092) Signed-off-by: tmontfort <[email protected]>
…ks (#3099) Signed-off-by: Keiven Chang <[email protected]>
Signed-off-by: Jacky <[email protected]>
…#3119) Signed-off-by: richardhuo-nv <[email protected]>
…egy structure + add gating tests for public CI (#3009) Signed-off-by: Tushar Sharma <[email protected]>
Signed-off-by: Krishnan Prashanth <[email protected]>
… device memory and vice versa (#2989) Signed-off-by: Olga Andreeva <[email protected]> Signed-off-by: oandreeva-nv <[email protected]> Co-authored-by: Ziqi Fan <[email protected]> Co-authored-by: oandreeva-nv <[email protected]>
Signed-off-by: Hannah Zhang <[email protected]>
Signed-off-by: Hannah Zhang <[email protected]>
…-allow-in-cluster-perf-benchmarks-kubectl-one-liner
WalkthroughAdds in-cluster benchmarking support: new Kubernetes utility is_running_in_cluster, updates to input validation and workflow to accept internal service URLs and normalize schemes in cluster, a Kubernetes Job manifest to run benchmarks in-cluster with PVC-backed storage, and a README documenting the end-to-end in-cluster benchmarking process. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor User
participant K8s as Kubernetes Job
participant C as Benchmark Runner (container)
participant WF as Workflow
participant V as Input Validation
participant S as Target Service
participant PVC as PVC Storage
User->>K8s: Apply benchmark_job.yaml
K8s->>C: Start pod with env and PVC mounted
C->>WF: run_endpoint_benchmark(model, endpoint, ...)
WF->>WF: Detect is_running_in_cluster()
WF->>WF: Build service_url (prefix http:// if missing)
WF->>V: validate_inputs(service_url)
V-->>WF: OK (accept internal host:port in-cluster)
WF->>S: Send benchmark requests
WF->>PVC: Write results to /data/results
WF-->>C: Complete sweep
C-->>K8s: Job completes (TTL for cleanup)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
Pre-merge checks❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Tip 👮 Agentic pre-merge checks are now available in preview!Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Please see the documentation for more information. Example: reviews:
pre_merge_checks:
custom_checks:
- name: "Undocumented Breaking Changes"
mode: "warning"
instructions: |
Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).Please share your feedback with us on this Discord post. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (10)
deploy/utils/kubernetes.py (1)
25-28: Make in-cluster detection more robust (env var fallback) and avoid repeated fs stats.Also handle cases where SA token isn’t mounted (automount disabled) by checking KUBERNETES_SERVICE_HOST. Cache the token path for clarity.
-import os +import os from pathlib import Path from typing import List @@ -def is_running_in_cluster() -> bool: - """Check if we're running inside a Kubernetes cluster""" - # Check for Kubernetes service account token (most reliable indicator) - return os.path.exists("/var/run/secrets/kubernetes.io/serviceaccount/token") +K8S_SA_TOKEN = Path("/var/run/secrets/kubernetes.io/serviceaccount/token") + +def is_running_in_cluster() -> bool: + """Return True if running inside a Kubernetes cluster.""" + # Prefer well-known env var; fall back to SA token presence + return bool(os.environ.get("KUBERNETES_SERVICE_HOST")) or K8S_SA_TOKEN.exists()Also applies to: 16-16
benchmarks/incluster/benchmark_job.yaml (1)
57-58: Add trailing newline.Satisfies yamllint new-line-at-end-of-file.
benchmarks/utils/workflow.py (2)
34-41: Factor URL normalization into a helper to avoid drift and handle scheme-less inputs cleanly.Keeps logic in one place and aligns with in-cluster behavior.
- # Handle internal service URLs by adding http:// prefix if needed - service_url = endpoint - if is_running_in_cluster() and not endpoint.lower().startswith( - ("http://", "https://") - ): - service_url = f"http://{endpoint}" + # Normalize endpoint to a usable URL (handles in-cluster scheme-less inputs) + service_url = normalize_service_url(endpoint)Add this helper near the top of the module:
from urllib.parse import urlsplit def normalize_service_url(endpoint: str) -> str: e = endpoint.strip() if e.lower().startswith(("http://", "https://")): return e if is_running_in_cluster(): return f"http://{e}" return e # Outside cluster, validation will have ensured scheme is present
84-85: Update docstring to reflect in-cluster support.- """Main benchmark workflow orchestrator for HTTP endpoints only""" + """Main benchmark workflow orchestrator for HTTP endpoints (and in-cluster internal service URLs)"""benchmarks/utils/benchmark.py (2)
130-132: Stale comment: now validates HTTP or in-cluster URLs.- # Validate that all inputs are HTTP endpoints + # Validate that inputs are HTTP endpoints or in-cluster service URLs
25-27: Action: shorten the long exception text or disable Ruff TRY003 for this file.ruff.toml enables tryceratops (select includes "TRY") and no TRY003 ignore exists, so Ruff will report TRY003 for these long exception messages.
File: benchmarks/utils/benchmark.py — lines 25–27 and 31–33.
Options:
- Shorten the exception message(s).
- Add an inline ignore on the raise line: # noqa: TRY003
- Add a per-file ignore in config: [tool.ruff.lint.extend-per-file-ignores] "benchmarks/utils/benchmark.py" = ["TRY003"].
benchmarks/incluster/README.md (4)
52-53: Path correctness: ensure envsubst reads the right file.Command assumes cwd at benchmarks/incluster. Use explicit path or add a cd.
-# Deploy the benchmark job -envsubst < benchmark_job.yaml | kubectl apply -f - +# Deploy the benchmark job +envsubst < benchmarks/incluster/benchmark_job.yaml | kubectl apply -f -
61-64: Prefer heading over emphasis for section titles (markdownlint MD036).-**Option B: One-liner deployment** +### Option B: One-liner deployment
42-45: Prefer heading over emphasis for section titles (markdownlint MD036).-**Option A: Set environment variables (recommended for multiple commands)** +### Option A: Set environment variables (recommended for multiple commands)
173-175: Fix service verification command: SERVICE_URL includes port; kubectl expects a Service name.Extract the name before the colon.
-# Verify your service URL is accessible -kubectl get svc $SERVICE_URL -n $NAMESPACE +# Verify your service exists and has endpoints +SVC_NAME="${SERVICE_URL%%:*}" +kubectl get svc "$SVC_NAME" -n "$NAMESPACE" +kubectl get endpoints "$SVC_NAME" -n "$NAMESPACE"
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
benchmarks/incluster/README.md(1 hunks)benchmarks/incluster/benchmark_job.yaml(1 hunks)benchmarks/utils/benchmark.py(1 hunks)benchmarks/utils/workflow.py(2 hunks)deploy/utils/kubernetes.py(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
benchmarks/utils/workflow.py (2)
deploy/utils/kubernetes.py (1)
is_running_in_cluster(25-28)benchmarks/utils/genai.py (1)
run_concurrency_sweep(104-118)
benchmarks/utils/benchmark.py (1)
deploy/utils/kubernetes.py (1)
is_running_in_cluster(25-28)
🪛 Ruff (0.12.2)
benchmarks/utils/benchmark.py
25-27: Avoid specifying long messages outside the exception class
(TRY003)
31-33: Avoid specifying long messages outside the exception class
(TRY003)
🪛 markdownlint-cli2 (0.17.2)
benchmarks/incluster/README.md
42-42: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
61-61: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
🪛 Checkov (3.2.334)
benchmarks/incluster/benchmark_job.yaml
[medium] 4-58: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 4-58: Minimize the admission of root containers
(CKV_K8S_23)
🪛 YAMLlint (1.37.1)
benchmarks/incluster/benchmark_job.yaml
[error] 58-58: no new line character at the end of file
(new-line-at-end-of-file)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (2)
benchmarks/utils/workflow.py (1)
48-55: Signature verified — no action required.
run_concurrency_sweep is defined in benchmarks/utils/genai.py:104 as (service_url, model_name, isl, osl, stddev, output_dir) and the call at benchmarks/utils/workflow.py:48 uses matching keyword args.benchmarks/incluster/benchmark_job.yaml (1)
57-57: Verify TTL controller availability before using ttlSecondsAfterFinishedkubectl was unavailable in the verification environment (kubectl: command not found). Run one of these on a cluster with kubectl access to confirm TTL support:
- kubectl api-resources | grep -i 'ttlSecondsAfterFinished'
- kubectl api-resources | grep -i ttl
If the TTL resource/controller is not present, remove or guard the ttlSecondsAfterFinished field in benchmarks/incluster/benchmark_job.yaml (line 57) or enable the TTL controller on the cluster.
Signed-off-by: Hannah Zhang <[email protected]>
Signed-off-by: Hannah Zhang <[email protected]>
biswapanda
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left few comments
Signed-off-by: Hannah Zhang <[email protected]>
…-allow-in-cluster-perf-benchmarks-kubectl-one-liner
Signed-off-by: Hannah Zhang <[email protected]>
…arks-kubectl-one-liner
|
lgtm. |
biswapanda
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Signed-off-by: Hannah Zhang <[email protected]>
…-allow-in-cluster-perf-benchmarks-kubectl-one-liner
|
/ok to test de853cf |
Signed-off-by: Hannah Zhang <[email protected]> Signed-off-by: zhongdaor <[email protected]> Signed-off-by: tmontfort <[email protected]> Signed-off-by: Keiven Chang <[email protected]> Signed-off-by: Jacky <[email protected]> Signed-off-by: richardhuo-nv <[email protected]> Signed-off-by: Tushar Sharma <[email protected]> Signed-off-by: [email protected] <[email protected]> Signed-off-by: Krishnan Prashanth <[email protected]> Signed-off-by: Olga Andreeva <[email protected]> Signed-off-by: oandreeva-nv <[email protected]> Co-authored-by: zhongdaor-nv <[email protected]> Co-authored-by: Thomas Montfort <[email protected]> Co-authored-by: Keiven C <[email protected]> Co-authored-by: Jacky <[email protected]> Co-authored-by: Richard Huo <[email protected]> Co-authored-by: Tushar Sharma <[email protected]> Co-authored-by: Tzu-Ling Kan <[email protected]> Co-authored-by: KrishnanPrash <[email protected]> Co-authored-by: Olga Andreeva <[email protected]> Co-authored-by: Ziqi Fan <[email protected]> Co-authored-by: oandreeva-nv <[email protected]> Signed-off-by: Jason Zhou <[email protected]>
Signed-off-by: Hannah Zhang <[email protected]> Signed-off-by: zhongdaor <[email protected]> Signed-off-by: tmontfort <[email protected]> Signed-off-by: Keiven Chang <[email protected]> Signed-off-by: Jacky <[email protected]> Signed-off-by: richardhuo-nv <[email protected]> Signed-off-by: Tushar Sharma <[email protected]> Signed-off-by: [email protected] <[email protected]> Signed-off-by: Krishnan Prashanth <[email protected]> Signed-off-by: Olga Andreeva <[email protected]> Signed-off-by: oandreeva-nv <[email protected]> Co-authored-by: zhongdaor-nv <[email protected]> Co-authored-by: Thomas Montfort <[email protected]> Co-authored-by: Keiven C <[email protected]> Co-authored-by: Jacky <[email protected]> Co-authored-by: Richard Huo <[email protected]> Co-authored-by: Tushar Sharma <[email protected]> Co-authored-by: Tzu-Ling Kan <[email protected]> Co-authored-by: KrishnanPrash <[email protected]> Co-authored-by: Olga Andreeva <[email protected]> Co-authored-by: Ziqi Fan <[email protected]> Co-authored-by: oandreeva-nv <[email protected]>
Signed-off-by: Hannah Zhang <[email protected]> Signed-off-by: zhongdaor <[email protected]> Signed-off-by: tmontfort <[email protected]> Signed-off-by: Keiven Chang <[email protected]> Signed-off-by: Jacky <[email protected]> Signed-off-by: richardhuo-nv <[email protected]> Signed-off-by: Tushar Sharma <[email protected]> Signed-off-by: [email protected] <[email protected]> Signed-off-by: Krishnan Prashanth <[email protected]> Signed-off-by: Olga Andreeva <[email protected]> Signed-off-by: oandreeva-nv <[email protected]> Co-authored-by: zhongdaor-nv <[email protected]> Co-authored-by: Thomas Montfort <[email protected]> Co-authored-by: Keiven C <[email protected]> Co-authored-by: Jacky <[email protected]> Co-authored-by: Richard Huo <[email protected]> Co-authored-by: Tushar Sharma <[email protected]> Co-authored-by: Tzu-Ling Kan <[email protected]> Co-authored-by: KrishnanPrash <[email protected]> Co-authored-by: Olga Andreeva <[email protected]> Co-authored-by: Ziqi Fan <[email protected]> Co-authored-by: oandreeva-nv <[email protected]> Signed-off-by: Jason Zhou <[email protected]>
Signed-off-by: Hannah Zhang <[email protected]> Signed-off-by: zhongdaor <[email protected]> Signed-off-by: tmontfort <[email protected]> Signed-off-by: Keiven Chang <[email protected]> Signed-off-by: Jacky <[email protected]> Signed-off-by: richardhuo-nv <[email protected]> Signed-off-by: Tushar Sharma <[email protected]> Signed-off-by: [email protected] <[email protected]> Signed-off-by: Krishnan Prashanth <[email protected]> Signed-off-by: Olga Andreeva <[email protected]> Signed-off-by: oandreeva-nv <[email protected]> Co-authored-by: zhongdaor-nv <[email protected]> Co-authored-by: Thomas Montfort <[email protected]> Co-authored-by: Keiven C <[email protected]> Co-authored-by: Jacky <[email protected]> Co-authored-by: Richard Huo <[email protected]> Co-authored-by: Tushar Sharma <[email protected]> Co-authored-by: Tzu-Ling Kan <[email protected]> Co-authored-by: KrishnanPrash <[email protected]> Co-authored-by: Olga Andreeva <[email protected]> Co-authored-by: Ziqi Fan <[email protected]> Co-authored-by: oandreeva-nv <[email protected]>
Signed-off-by: Hannah Zhang <[email protected]> Signed-off-by: zhongdaor <[email protected]> Signed-off-by: tmontfort <[email protected]> Signed-off-by: Keiven Chang <[email protected]> Signed-off-by: Jacky <[email protected]> Signed-off-by: richardhuo-nv <[email protected]> Signed-off-by: Tushar Sharma <[email protected]> Signed-off-by: [email protected] <[email protected]> Signed-off-by: Krishnan Prashanth <[email protected]> Signed-off-by: Olga Andreeva <[email protected]> Signed-off-by: oandreeva-nv <[email protected]> Co-authored-by: zhongdaor-nv <[email protected]> Co-authored-by: Thomas Montfort <[email protected]> Co-authored-by: Keiven C <[email protected]> Co-authored-by: Jacky <[email protected]> Co-authored-by: Richard Huo <[email protected]> Co-authored-by: Tushar Sharma <[email protected]> Co-authored-by: Tzu-Ling Kan <[email protected]> Co-authored-by: KrishnanPrash <[email protected]> Co-authored-by: Olga Andreeva <[email protected]> Co-authored-by: Ziqi Fan <[email protected]> Co-authored-by: oandreeva-nv <[email protected]> Signed-off-by: Jason Zhou <[email protected]>
Signed-off-by: Hannah Zhang <[email protected]> Signed-off-by: zhongdaor <[email protected]> Signed-off-by: tmontfort <[email protected]> Signed-off-by: Keiven Chang <[email protected]> Signed-off-by: Jacky <[email protected]> Signed-off-by: richardhuo-nv <[email protected]> Signed-off-by: Tushar Sharma <[email protected]> Signed-off-by: [email protected] <[email protected]> Signed-off-by: Krishnan Prashanth <[email protected]> Signed-off-by: Olga Andreeva <[email protected]> Signed-off-by: oandreeva-nv <[email protected]> Co-authored-by: zhongdaor-nv <[email protected]> Co-authored-by: Thomas Montfort <[email protected]> Co-authored-by: Keiven C <[email protected]> Co-authored-by: Jacky <[email protected]> Co-authored-by: Richard Huo <[email protected]> Co-authored-by: Tushar Sharma <[email protected]> Co-authored-by: Tzu-Ling Kan <[email protected]> Co-authored-by: KrishnanPrash <[email protected]> Co-authored-by: Olga Andreeva <[email protected]> Co-authored-by: Ziqi Fan <[email protected]> Co-authored-by: oandreeva-nv <[email protected]> Signed-off-by: Jason Zhou <[email protected]>
Signed-off-by: Hannah Zhang <[email protected]> Signed-off-by: zhongdaor <[email protected]> Signed-off-by: tmontfort <[email protected]> Signed-off-by: Keiven Chang <[email protected]> Signed-off-by: Jacky <[email protected]> Signed-off-by: richardhuo-nv <[email protected]> Signed-off-by: Tushar Sharma <[email protected]> Signed-off-by: [email protected] <[email protected]> Signed-off-by: Krishnan Prashanth <[email protected]> Signed-off-by: Olga Andreeva <[email protected]> Signed-off-by: oandreeva-nv <[email protected]> Co-authored-by: zhongdaor-nv <[email protected]> Co-authored-by: Thomas Montfort <[email protected]> Co-authored-by: Keiven C <[email protected]> Co-authored-by: Jacky <[email protected]> Co-authored-by: Richard Huo <[email protected]> Co-authored-by: Tushar Sharma <[email protected]> Co-authored-by: Tzu-Ling Kan <[email protected]> Co-authored-by: KrishnanPrash <[email protected]> Co-authored-by: Olga Andreeva <[email protected]> Co-authored-by: Ziqi Fan <[email protected]> Co-authored-by: oandreeva-nv <[email protected]>
Signed-off-by: Hannah Zhang <[email protected]> Signed-off-by: zhongdaor <[email protected]> Signed-off-by: tmontfort <[email protected]> Signed-off-by: Keiven Chang <[email protected]> Signed-off-by: Jacky <[email protected]> Signed-off-by: richardhuo-nv <[email protected]> Signed-off-by: Tushar Sharma <[email protected]> Signed-off-by: [email protected] <[email protected]> Signed-off-by: Krishnan Prashanth <[email protected]> Signed-off-by: Olga Andreeva <[email protected]> Signed-off-by: oandreeva-nv <[email protected]> Co-authored-by: zhongdaor-nv <[email protected]> Co-authored-by: Thomas Montfort <[email protected]> Co-authored-by: Keiven C <[email protected]> Co-authored-by: Jacky <[email protected]> Co-authored-by: Richard Huo <[email protected]> Co-authored-by: Tushar Sharma <[email protected]> Co-authored-by: Tzu-Ling Kan <[email protected]> Co-authored-by: KrishnanPrash <[email protected]> Co-authored-by: Olga Andreeva <[email protected]> Co-authored-by: Ziqi Fan <[email protected]> Co-authored-by: oandreeva-nv <[email protected]> Signed-off-by: Kyle H <[email protected]>

Overview:
Details:
Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit