feat: allow in-cluster perf benchmarks with a kubectl one-liner #3144

hhzhang16 · 2025-09-19T18:12:59Z

Overview:

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

closes GitHub issue: #xxx

Summary by CodeRabbit

New Features
- Added support for running benchmarks directly inside Kubernetes clusters using internal service URLs and cluster DNS.
- Improved input handling to accept in-cluster endpoints and auto-scheme prefixing for seamless execution.
- Provided a ready-to-run Kubernetes Job template for configurable, isolated benchmark runs with resource controls and automatic cleanup.
Documentation
- Added an in-cluster benchmarking guide with prerequisites, quick start, configuration via environment variables, monitoring/debugging steps, result retrieval, and troubleshooting.
- Documented result storage on a persistent volume with a structure consistent with local benchmarking.

copy-pr-bot · 2025-09-19T18:13:03Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2025-09-19T18:21:02Z

Walkthrough

Adds in-cluster benchmarking support: new Kubernetes utility is_running_in_cluster, updates to input validation and workflow to accept internal service URLs and normalize schemes in cluster, a Kubernetes Job manifest to run benchmarks in-cluster with PVC-backed storage, and a README documenting the end-to-end in-cluster benchmarking process.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| In-cluster docs `benchmarks/incluster/README.md` | New documentation detailing prerequisites, deployment, execution, result retrieval, monitoring, and troubleshooting for running Dynamo benchmarks inside Kubernetes using internal service URLs and PVC storage. |
| K8s Job manifest `benchmarks/incluster/benchmark_job.yaml` | Adds a batch/v1 Job (dynamo-benchmark) that runs the benchmark runner with configurable env vars, mounts PVC `dynamo-pvc` at `/data`, uses `dynamo-sa`, references secrets, and executes `python3 -m benchmarks.utils.benchmark` with in-cluster inputs. |
| Benchmark validation + workflow `benchmarks/utils/benchmark.py`, `benchmarks/utils/workflow.py` | Enables cluster-aware input handling: accepts internal host:port endpoints in-cluster, prefixes `http://` when missing, updates logging and parameter wiring to use `service_url`, and passes `stddev` and `output_dir/label` to concurrency sweep. |
| Kubernetes utilities `deploy/utils/kubernetes.py` | Introduces `is_running_in_cluster()` to detect Kubernetes environment by presence of the service account token file at `/var/run/secrets/kubernetes.io/serviceaccount/token`. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant K8s as Kubernetes Job
  participant C as Benchmark Runner (container)
  participant WF as Workflow
  participant V as Input Validation
  participant S as Target Service
  participant PVC as PVC Storage

  User->>K8s: Apply benchmark_job.yaml
  K8s->>C: Start pod with env and PVC mounted
  C->>WF: run_endpoint_benchmark(model, endpoint, ...)
  WF->>WF: Detect is_running_in_cluster()
  WF->>WF: Build service_url (prefix http:// if missing)
  WF->>V: validate_inputs(service_url)
  V-->>WF: OK (accept internal host:port in-cluster)
  WF->>S: Send benchmark requests
  WF->>PVC: Write results to /data/results
  WF-->>C: Complete sweep
  C-->>K8s: Job completes (TTL for cleanup)
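
The URL-building and validation steps in the diagram can be sketched in a few lines of Python. This is a minimal illustration rather than the PR's code: `build_service_url`, `validate_endpoint`, and the service name `vllm-agg-frontend` are hypothetical, and the `in_cluster` flag is passed in explicitly instead of being detected so that the example runs anywhere.

```python
from urllib.parse import urlsplit


def build_service_url(endpoint: str, in_cluster: bool) -> str:
    """Prefix http:// to a scheme-less endpoint, but only inside a cluster."""
    endpoint = endpoint.strip()
    if endpoint.lower().startswith(("http://", "https://")):
        return endpoint
    return f"http://{endpoint}" if in_cluster else endpoint


def validate_endpoint(service_url: str) -> None:
    """Raise ValueError unless the URL has an http(s) scheme and a host."""
    parts = urlsplit(service_url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not an HTTP endpoint: {service_url!r}")


# In-cluster: a bare host:port is normalized and then passes validation.
# "vllm-agg-frontend" is a made-up service name used only for illustration.
url = build_service_url("vllm-agg-frontend:8000", in_cluster=True)
validate_endpoint(url)
print(url)
```

Outside a cluster the same bare `host:port` input is left unchanged, so `validate_endpoint` rejects it, which mirrors the "accept internal host:port in-cluster" branch of the diagram.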

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

feat: decouple dynamo k8s setup from additional benchmarking requirements #2973 — Also introduces/uses deploy/utils/kubernetes.py:is_running_in_cluster and aligns with in-cluster benchmarking resources.
feat: remove kubectl dependencies from benchmarking #3098 — Touches benchmarks/utils/benchmark.py and workflow paths but restricts inputs to HTTP endpoints; potentially conflicts with this PR’s in-cluster URL acceptance.

Poem

In pods I hop, a cluster-bound sprite,
Nibbling on endpoints, day and night.
PVC burrow where results now dwell,
I chart the sweeps—oh, data to tell!
With kube-side trails and service smell,
Benchmarks thump—all is well. 🐇📊

Pre-merge checks

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description Check | ⚠️ Warning | The PR description contains only the template headings with no substantive content and a placeholder Related Issues entry ("#xxx"), so it does not describe the changes, rationale, or reviewer guidance required by the repository's description_template and is therefore largely incomplete. | Please populate the template: add an Overview that summarizes the goal (enable in-cluster perf benchmarks and the kubectl one-liner), a Details section that lists and explains key file changes (e.g., benchmarks/incluster/README.md, benchmarks/incluster/benchmark_job.yaml, benchmarks/utils/{benchmark.py,workflow.py}, deploy/utils/kubernetes.py) and the rationale, and a "Where should the reviewer start?" section pointing to those files and any tests or commands to run; replace the placeholder Related Issues entry with the actual issue number and include brief testing instructions and expected outcomes so reviewers can validate the change. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The title "feat: allow in-cluster perf benchmarks with a kubectl one-liner" directly summarizes the primary change: enabling in-cluster performance benchmarking and providing a kubectl one-liner deployment workflow; it is concise, specific, and related to the README, Job manifest, workflow, and cluster-detection helper introduced in the changeset. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (10)

deploy/utils/kubernetes.py (1)
25-28: Make in-cluster detection more robust (env var fallback) and avoid repeated fs stats.

Also handle cases where SA token isn’t mounted (automount disabled) by checking KUBERNETES_SERVICE_HOST. Cache the token path for clarity.
-import os
+import os
 from pathlib import Path
 from typing import List
@@
-def is_running_in_cluster() -> bool:
-    """Check if we're running inside a Kubernetes cluster"""
-    # Check for Kubernetes service account token (most reliable indicator)
-    return os.path.exists("/var/run/secrets/kubernetes.io/serviceaccount/token")
+K8S_SA_TOKEN = Path("/var/run/secrets/kubernetes.io/serviceaccount/token")
+
+def is_running_in_cluster() -> bool:
+    """Return True if running inside a Kubernetes cluster."""
+    # Prefer well-known env var; fall back to SA token presence
+    return bool(os.environ.get("KUBERNETES_SERVICE_HOST")) or K8S_SA_TOKEN.exists()
Also applies to: 16-16
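
One way to make the suggested detection easy to unit-test is to let callers inject the environment and token path. The sketch below is an assumption layered on the suggestion above, not code from the PR: the `env` and `token_path` parameters are hypothetical additions.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping, Optional

K8S_SA_TOKEN = Path("/var/run/secrets/kubernetes.io/serviceaccount/token")


def is_running_in_cluster(
    env: Optional[Mapping[str, str]] = None,
    token_path: Path = K8S_SA_TOKEN,
) -> bool:
    """Return True if either in-cluster signal is present."""
    env = os.environ if env is None else env
    return bool(env.get("KUBERNETES_SERVICE_HOST")) or token_path.exists()


# Exercise both signals without needing a real cluster.
with tempfile.TemporaryDirectory() as tmp:
    token = Path(tmp) / "token"
    # Neither signal present.
    print(is_running_in_cluster(env={}, token_path=token))
    # Env var only (covers pods with service account automount disabled).
    print(is_running_in_cluster(env={"KUBERNETES_SERVICE_HOST": "10.0.0.1"}, token_path=token))
    # Token file only.
    token.write_text("dummy")
    print(is_running_in_cluster(env={}, token_path=token))
```

With the defaults left in place the function behaves like the suggested version, so existing call sites would not need to change.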
benchmarks/incluster/benchmark_job.yaml (1)

57-58: Add trailing newline.

Satisfies yamllint new-line-at-end-of-file.
benchmarks/utils/workflow.py (2)
34-41: Factor URL normalization into a helper to avoid drift and handle scheme-less inputs cleanly.

Keeps logic in one place and aligns with in-cluster behavior.
-    # Handle internal service URLs by adding http:// prefix if needed
-    service_url = endpoint
-    if is_running_in_cluster() and not endpoint.lower().startswith(
-        ("http://", "https://")
-    ):
-        service_url = f"http://{endpoint}"
+    # Normalize endpoint to a usable URL (handles in-cluster scheme-less inputs)
+    service_url = normalize_service_url(endpoint)
Add this helper near the top of the module:
def normalize_service_url(endpoint: str) -> str:
    """Return endpoint as a URL, adding http:// to scheme-less in-cluster inputs."""
    e = endpoint.strip()
    if e.lower().startswith(("http://", "https://")):
        return e
    if is_running_in_cluster():
        return f"http://{e}"
    return e  # Outside the cluster, validation is expected to have required a scheme
84-85: Update docstring to reflect in-cluster support.
-    """Main benchmark workflow orchestrator for HTTP endpoints only"""
+    """Main benchmark workflow orchestrator for HTTP endpoints (and in-cluster internal service URLs)"""
benchmarks/utils/benchmark.py (2)
130-132: Stale comment: now validates HTTP or in-cluster URLs.
-        # Validate that all inputs are HTTP endpoints
+        # Validate that inputs are HTTP endpoints or in-cluster service URLs
25-27: Action: shorten the long exception text or disable Ruff TRY003 for this file.

ruff.toml enables tryceratops (select includes "TRY") and no TRY003 ignore exists, so Ruff will report TRY003 for these long exception messages.

File: benchmarks/utils/benchmark.py — lines 25–27 and 31–33.

Options:

Shorten the exception message(s).

Add an inline ignore on the raise line: # noqa: TRY003

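Add a per-file ignore in ruff.toml: [lint.extend-per-file-ignores] "benchmarks/utils/benchmark.py" = ["TRY003"] (the [tool.ruff.lint.extend-per-file-ignores] table name applies only when Ruff is configured through pyproject.toml).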
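
A further option, in line with the wording of the rule itself, is to move the long message into a dedicated exception class. The sketch below assumes a hypothetical `InvalidEndpointError` and `require_endpoint`; the actual messages and checks in `benchmarks/utils/benchmark.py` are not shown in this thread, so the text here is illustrative only.

```python
class InvalidEndpointError(ValueError):
    """Raised when a benchmark input is not a usable endpoint."""

    def __init__(self, endpoint: str) -> None:
        self.endpoint = endpoint
        # The long message lives inside the class, not at the raise site.
        super().__init__(
            f"Invalid endpoint {endpoint!r}: expected an http(s):// URL "
            "or, in-cluster, an internal host:port service address."
        )


def require_endpoint(endpoint: str) -> str:
    """Reject blank inputs; a stand-in for the real validation logic."""
    if not endpoint.strip():
        # The raise passes only a value, so there is no long literal here.
        raise InvalidEndpointError(endpoint)
    return endpoint
```

Subclassing `ValueError` keeps any existing `except ValueError` handlers working while giving callers a more specific type to catch.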
benchmarks/incluster/README.md (4)
52-53: Path correctness: ensure envsubst reads the right file.

Command assumes cwd at benchmarks/incluster. Use explicit path or add a cd.
-# Deploy the benchmark job
-envsubst < benchmark_job.yaml | kubectl apply -f -
+# Deploy the benchmark job
+envsubst < benchmarks/incluster/benchmark_job.yaml | kubectl apply -f -
61-64: Prefer heading over emphasis for section titles (markdownlint MD036).
-**Option B: One-liner deployment**
+### Option B: One-liner deployment
42-45: Prefer heading over emphasis for section titles (markdownlint MD036).
-**Option A: Set environment variables (recommended for multiple commands)**
+### Option A: Set environment variables (recommended for multiple commands)
173-175: Fix service verification command: SERVICE_URL includes port; kubectl expects a Service name.

Extract the name before the colon.
-# Verify your service URL is accessible
-kubectl get svc $SERVICE_URL -n $NAMESPACE
+# Verify your service exists and has endpoints
+SVC_NAME="${SERVICE_URL%%:*}"
+kubectl get svc "$SVC_NAME" -n "$NAMESPACE"
+kubectl get endpoints "$SVC_NAME" -n "$NAMESPACE"
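
The `${SERVICE_URL%%:*}` expansion handles a bare `name:port` value, but it would return `http` for a value that carries a scheme. If the same extraction were needed on the Python side, a slightly more tolerant version could look like the sketch below. The helper name `service_name` is hypothetical, and IPv6 literals are deliberately not handled.

```python
def service_name(service_url: str) -> str:
    """Return the Service name from 'name:port', a URL, or a cluster FQDN."""
    host = service_url.strip()
    if "://" in host:
        host = host.split("://", 1)[1]  # drop the scheme
    host = host.split("/", 1)[0]  # drop any path
    if ":" in host:
        host = host.rsplit(":", 1)[0]  # drop the port
    return host.split(".", 1)[0]  # drop .namespace.svc.cluster.local


print(service_name("frontend:8000"))
print(service_name("http://frontend.dynamo.svc.cluster.local:8000/v1"))
```

Both calls print `frontend`, which is the value `kubectl get svc` expects.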

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2d39f1b and e7ed272.

📒 Files selected for processing (5)

benchmarks/incluster/README.md (1 hunks)
benchmarks/incluster/benchmark_job.yaml (1 hunks)
benchmarks/utils/benchmark.py (1 hunks)
benchmarks/utils/workflow.py (2 hunks)
deploy/utils/kubernetes.py (2 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

benchmarks/utils/workflow.py (2)

deploy/utils/kubernetes.py (1)

is_running_in_cluster (25-28)

benchmarks/utils/genai.py (1)

run_concurrency_sweep (104-118)

benchmarks/utils/benchmark.py (1)

deploy/utils/kubernetes.py (1)

is_running_in_cluster (25-28)

🪛 Ruff (0.12.2)

benchmarks/utils/benchmark.py

25-27: Avoid specifying long messages outside the exception class

(TRY003)

31-33: Avoid specifying long messages outside the exception class

(TRY003)

🪛 markdownlint-cli2 (0.17.2)

benchmarks/incluster/README.md

42-42: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

61-61: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

🪛 Checkov (3.2.334)

benchmarks/incluster/benchmark_job.yaml

[medium] 4-58: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)

[medium] 4-58: Minimize the admission of root containers

(CKV_K8S_23)
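
Both findings are usually addressed through the container `securityContext`. The sketch below shows the idea on a manifest loaded as a Python dict; `harden_job` and the container name `benchmark-runner` are hypothetical, and the real `benchmark_job.yaml` is not reproduced here. Note that `runAsNonRoot: true` only works if the image actually runs as a non-root user, otherwise the container will be refused at start.

```python
import copy
from typing import Any, Dict


def harden_job(job: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a Job manifest with a restrictive securityContext."""
    hardened = copy.deepcopy(job)
    pod_spec = hardened["spec"]["template"]["spec"]
    for container in pod_spec.get("containers", []):
        ctx = container.setdefault("securityContext", {})
        # Explicit values already in the manifest are left untouched.
        ctx.setdefault("allowPrivilegeEscalation", False)  # targets CKV_K8S_20
        ctx.setdefault("runAsNonRoot", True)  # targets CKV_K8S_23
    return hardened


# Minimal stand-in manifest; only the fields needed for the example.
job = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {"name": "dynamo-benchmark"},
    "spec": {"template": {"spec": {"containers": [{"name": "benchmark-runner"}]}}},
}
print(harden_job(job)["spec"]["template"]["spec"]["containers"][0]["securityContext"])
```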

🪛 YAMLlint (1.37.1)

benchmarks/incluster/benchmark_job.yaml

[error] 58-58: no new line character at the end of file

(new-line-at-end-of-file)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: Build and Test - dynamo

🔇 Additional comments (2)

benchmarks/utils/workflow.py (1)

48-55: Signature verified — no action required.
run_concurrency_sweep is defined in benchmarks/utils/genai.py:104 as (service_url, model_name, isl, osl, stddev, output_dir) and the call at benchmarks/utils/workflow.py:48 uses matching keyword args.
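
A check like this can also be automated with `inspect.signature`. The sketch below uses a stand-in stub carrying the parameter list quoted above; the stub body, the helper `call_matches`, and the argument values are assumptions for illustration, not the real function or call site.

```python
import inspect


def run_concurrency_sweep(service_url, model_name, isl, osl, stddev, output_dir):
    """Stand-in stub with the parameter list quoted in the review comment."""


def call_matches(func, **kwargs) -> bool:
    """Return True if func can be called with exactly these keyword args."""
    try:
        inspect.signature(func).bind(**kwargs)
    except TypeError:
        return False
    return True


args = dict(
    service_url="http://svc:8000",
    model_name="example-model",
    isl=2000,
    osl=256,
    stddev=10,
    output_dir="/data/results",
)
print(call_matches(run_concurrency_sweep, **args))
```

Dropping a required argument such as `stddev`, or passing an unknown keyword, makes `bind` raise `TypeError`, so the helper returns `False`.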

benchmarks/incluster/benchmark_job.yaml (1)

57-57: Verify TTL controller availability before using ttlSecondsAfterFinished

kubectl was unavailable in the verification environment (kubectl: command not found). ttlSecondsAfterFinished is a field of the Job spec rather than an API resource, so kubectl api-resources will not list it. To confirm that the cluster's API serves the field, run this on a cluster with kubectl access:

kubectl explain job.spec.ttlSecondsAfterFinished

The TTL-after-finished controller has been stable since Kubernetes v1.23. On older clusters where it is not enabled, remove or guard the ttlSecondsAfterFinished field in benchmarks/incluster/benchmark_job.yaml (line 57) or enable the TTL controller on the cluster.

biswapanda

Left a few comments.

biswapanda · 2025-09-22T22:38:21Z

Commit history seems to be messed up

biswapanda · 2025-09-23T00:40:01Z

LGTM.
Please address Itay's concerns in the docs around the server-side/client-side distinction; a summary or images might help.

biswapanda

lgtm

hhzhang16 · 2025-09-23T16:25:17Z

/ok to test de853cf

Signed-off-by: Hannah Zhang <[email protected]> Signed-off-by: zhongdaor <[email protected]> Signed-off-by: tmontfort <[email protected]> Signed-off-by: Keiven Chang <[email protected]> Signed-off-by: Jacky <[email protected]> Signed-off-by: richardhuo-nv <[email protected]> Signed-off-by: Tushar Sharma <[email protected]> Signed-off-by: [email protected] <[email protected]> Signed-off-by: Krishnan Prashanth <[email protected]> Signed-off-by: Olga Andreeva <[email protected]> Signed-off-by: oandreeva-nv <[email protected]> Co-authored-by: zhongdaor-nv <[email protected]> Co-authored-by: Thomas Montfort <[email protected]> Co-authored-by: Keiven C <[email protected]> Co-authored-by: Jacky <[email protected]> Co-authored-by: Richard Huo <[email protected]> Co-authored-by: Tushar Sharma <[email protected]> Co-authored-by: Tzu-Ling Kan <[email protected]> Co-authored-by: KrishnanPrash <[email protected]> Co-authored-by: Olga Andreeva <[email protected]> Co-authored-by: Ziqi Fan <[email protected]> Co-authored-by: oandreeva-nv <[email protected]> Signed-off-by: Jason Zhou <[email protected]>

Signed-off-by: Hannah Zhang <[email protected]> Signed-off-by: zhongdaor <[email protected]> Signed-off-by: tmontfort <[email protected]> Signed-off-by: Keiven Chang <[email protected]> Signed-off-by: Jacky <[email protected]> Signed-off-by: richardhuo-nv <[email protected]> Signed-off-by: Tushar Sharma <[email protected]> Signed-off-by: [email protected] <[email protected]> Signed-off-by: Krishnan Prashanth <[email protected]> Signed-off-by: Olga Andreeva <[email protected]> Signed-off-by: oandreeva-nv <[email protected]> Co-authored-by: zhongdaor-nv <[email protected]> Co-authored-by: Thomas Montfort <[email protected]> Co-authored-by: Keiven C <[email protected]> Co-authored-by: Jacky <[email protected]> Co-authored-by: Richard Huo <[email protected]> Co-authored-by: Tushar Sharma <[email protected]> Co-authored-by: Tzu-Ling Kan <[email protected]> Co-authored-by: KrishnanPrash <[email protected]> Co-authored-by: Olga Andreeva <[email protected]> Co-authored-by: Ziqi Fan <[email protected]> Co-authored-by: oandreeva-nv <[email protected]>

Signed-off-by: Hannah Zhang <[email protected]> Signed-off-by: zhongdaor <[email protected]> Signed-off-by: tmontfort <[email protected]> Signed-off-by: Keiven Chang <[email protected]> Signed-off-by: Jacky <[email protected]> Signed-off-by: richardhuo-nv <[email protected]> Signed-off-by: Tushar Sharma <[email protected]> Signed-off-by: [email protected] <[email protected]> Signed-off-by: Krishnan Prashanth <[email protected]> Signed-off-by: Olga Andreeva <[email protected]> Signed-off-by: oandreeva-nv <[email protected]> Co-authored-by: zhongdaor-nv <[email protected]> Co-authored-by: Thomas Montfort <[email protected]> Co-authored-by: Keiven C <[email protected]> Co-authored-by: Jacky <[email protected]> Co-authored-by: Richard Huo <[email protected]> Co-authored-by: Tushar Sharma <[email protected]> Co-authored-by: Tzu-Ling Kan <[email protected]> Co-authored-by: KrishnanPrash <[email protected]> Co-authored-by: Olga Andreeva <[email protected]> Co-authored-by: Ziqi Fan <[email protected]> Co-authored-by: oandreeva-nv <[email protected]> Signed-off-by: Kyle H <[email protected]>

hhzhang16 and others added 16 commits September 8, 2025 15:39

feat: initial benchmarking wrapper in-cluster work

558482d

Signed-off-by: Hannah Zhang <[email protected]>

Merge branch 'main' of github.com:ai-dynamo/dynamo into hannahz/dyn-973…

7cc6edb

…-allow-in-cluster-perf-benchmarks-kubectl-one-liner Signed-off-by: Hannah Zhang <[email protected]>

feat: update benchmark job for in-cluster benchmarking following late…

5a09233

…st main Signed-off-by: Hannah Zhang <[email protected]>

feat: update in-cluster benchmark job and yaml

8f19a4d

Signed-off-by: Hannah Zhang <[email protected]>

feat: enhance GPT OSS frontend with improved harmony tool calling par…

3ff6675

…ser and reasoning parser (#2999) Signed-off-by: zhongdaor <[email protected]>

feat(operator): mechanism for disabling imagePullSecrets discovery (#…

9482320

…3092) Signed-off-by: tmontfort <[email protected]>

refactor: simplify Dockerfile.vllm, enable local-dev for all framewor…

f7cc9e9

…ks (#3099) Signed-off-by: Keiven Chang <[email protected]>

feat: Request Cancellation unary request support (#3004)

d5f0495

Signed-off-by: Jacky <[email protected]>

build: update trtllm to v1.1.0rc5 to enable trtllm + KVBM integration (…

1648836

…#3119) Signed-off-by: richardhuo-nv <[email protected]>

build: OPS-597, OPS-861 restructure TRT-LLM to follow container strat…

91181f6

…egy structure + add gating tests for public CI (#3009) Signed-off-by: Tushar Sharma <[email protected]>

feat: Sglang canary health check (#3103)

89e074c

Signed-off-by: [email protected] <[email protected]>

feat: Convert message[content] from list to string. (#3067)

271ef47

Signed-off-by: Krishnan Prashanth <[email protected]>

feat: update READMe commands

8ee077f

Signed-off-by: Hannah Zhang <[email protected]>

feat: update READMe commands

4ac8147

Signed-off-by: Hannah Zhang <[email protected]>

Merge branch 'main' of github.com:ai-dynamo/dynamo into hannahz/dyn-973…

e7ed272

…-allow-in-cluster-perf-benchmarks-kubectl-one-liner

hhzhang16 requested review from a team, atchernych, biswapanda, hutm, ishandhanani, julienmancuso, mohammedabdulwahhab and nnshah1 as code owners September 19, 2025 18:13

pull-request-size bot added the size/L label Sep 19, 2025

github-actions bot added the feat label Sep 19, 2025

coderabbitai bot reviewed Sep 19, 2025

hhzhang16 added 2 commits September 22, 2025 10:20

docs: have user modify benchmark job instead of using envsubst

69bcfa8

Signed-off-by: Hannah Zhang <[email protected]>

docs: add tldr

e83590b

Signed-off-by: Hannah Zhang <[email protected]>