
Conversation

@ilyasher ilyasher commented Sep 17, 2025

Overview:

This PR adds a --use-ai-configurator flag to profile_sla.py. When this flag is passed, profile_sla.py uses aiconfigurator to estimate model performance for the various configurations it tests, instead of running an actual dynamo deployment.

Advantages of --use-ai-configurator:

  • Running profile_sla.py takes seconds instead of hours
  • No GPU access needed

Disadvantages:

  • Limited support for models and GPU types
  • Estimated performance may be inaccurate

Example usage

python3 benchmarks/profiler/profile_sla.py --config components/backends/trtllm/deploy/disagg.yaml --backend trtllm --use-ai-configurator --aic-system h200_sxm --aic-model-name QWEN3_32B --backend-version 0.20.0

Details:

  • Add the --use-ai-configurator flag, along with --backend-version, --aic-model-name, and --aic-system
  • Change profile_sla.py to call aiconfigurator when --use-ai-configurator is passed
  • Add estimate_perf.py, which contains a helper class for using aiconfigurator (a rough usage sketch follows this list)
  • Add tests for profile_sla.py with --use-ai-configurator
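
A rough usage sketch of the new helper class (the method names and result keys are taken from the review summary further down; the constructor arguments and exact signatures shown here are illustrative and may not match estimate_perf.py exactly):

from benchmarks.profiler.utils.estimate_perf import AIConfiguratorPerfEstimator

# Hypothetical construction; the real constructor arguments may differ.
estimator = AIConfiguratorPerfEstimator(
    "QWEN3_32B",   # --aic-model-name
    "h200_sxm",    # --aic-system
    "trtllm",      # --backend
    "0.20.0",      # --backend-version
)

# Prefill: estimate TTFT (reported via a "context_latency" field).
prefill_perf = estimator.estimate_prefill_perf(3000, tp_size=2)
ttft_ms = prefill_perf["context_latency"]

# Decode: derive the concurrency sweep from the estimated KV-token budget,
# then read the estimated ITL and throughput for a candidate batch size.
max_kv_tokens = estimator.get_max_kv_tokens(3000, 150, tp_size=2)
decode_perf = estimator.estimate_perf(3000, 150, 32, tp_size=2)
itl_ms = decode_perf["tpot"]
thpt_per_gpu = decode_perf["tokens/s/gpu"]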

Where should the reviewer start?

Reviewers can start with the added tests, then the changes to profile_sla.py, and then the remaining files.

Summary by CodeRabbit

  • New Features

    • Optional AI-based performance estimation for profiling without real deployments.
    • Added CLI options: --use-ai-configurator, --aic-system, --aic-model-name, --backend-version, with input validation and warnings when unused.
    • Prefill and decode profiling now support AI-driven estimates while preserving existing outputs and plots.
  • Tests

    • New integration tests covering AI-configurator flows, missing/invalid arguments, and multiple backend/model versions.
    • Updated dry-run tests to include the new CLI flags.

copy-pr-bot bot commented Sep 17, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai bot commented Sep 17, 2025

Walkthrough

Introduces AI-configurator-based performance estimation into the profiling flow, adding a new estimator module, pluggable callbacks in prefill/decode utilities, CLI flags to select AI-based paths, and tests validating AI-configurator integration and argument handling. Legacy live-deployment behavior remains unchanged when the AI-configurator option is not used.

Changes

  • Profiler CLI and orchestration (benchmarks/profiler/profile_sla.py): Adds AI-configurator mode with new CLI flags (--use-ai-configurator, --aic-system, --aic-model-name, --backend-version); routes prefill/decode to AI-based estimators when enabled; validates required inputs; skips deployment in the AI path; preserves the legacy path otherwise.
  • AI Configurator Estimator (benchmarks/profiler/utils/estimate_perf.py): New module providing AIConfiguratorPerfEstimator with lazy import, database/backend resolution, and methods estimate_perf, estimate_prefill_perf, get_max_batch_size, and get_max_kv_tokens.
  • Decode profiling utils (benchmarks/profiler/utils/profile_decode.py): Refactors the sweep into a helper that takes a callback for ITL/throughput; adds profile_decode_aiconfigurator and an optional estimator parameter; preserves data collection/plotting; skips entries with missing metrics (a generic sketch of this callback pattern follows this list).
  • Prefill profiling utils (benchmarks/profiler/utils/profile_prefill.py): Refactors into a helper using a TTFT callback; adds profile_prefill_aiconfigurator; keeps interpolation/plotting; the inner get_ttft in the AI path computes TTFT but lacks an explicit return.
  • Tests: AI-configurator integration (tests/profiler/test_profile_sla_aiconfigurator.py): New tests covering missing/invalid AI-configurator args and successful runs across versions/models without GPU.
  • Tests: Dry-run arg parity (tests/profiler/test_profile_sla_dryrun.py): Extends fixtures to include use_ai_configurator, aic_system, aic_model_name, and backend_version defaults.
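
A minimal sketch of the callback pattern referenced in the profile_decode.py item above (function and parameter names here are illustrative, not the exact ones in this PR):

from typing import Callable, Optional, Tuple

# Illustrative only: the real helpers in profile_decode.py / profile_prefill.py
# may use different names and signatures. The shared loop sweeps candidate
# request counts and delegates metric collection to a callback, so the same
# code serves both the live-benchmark path and the aiconfigurator path.
def sweep_decode(
    num_request_range: list,
    get_itl_and_thpt: Callable[[int], Optional[Tuple[float, float]]],
):
    itl_values, thpt_values = [], []
    for num_request in num_request_range:
        result = get_itl_and_thpt(num_request)
        if result is None:
            # Skip entries with missing metrics, as the summary above describes.
            continue
        itl, thpt = result
        itl_values.append(itl)
        thpt_values.append(thpt)
    return itl_values, thpt_values

# Live path: the callback would wrap benchmark_decode(...) from genai_perf.
# AI path: the callback would wrap estimator.estimate_perf(...) and read the
# estimated TPOT and tokens-per-second-per-GPU from its result.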

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant CLI as profile_sla CLI
  participant Prefill as profile_prefill*
  participant Decode as profile_decode*
  participant Est as AIConfiguratorPerfEstimator
  participant Bench as Live Benchmark

  rect rgba(230,245,255,0.5)
    Note over CLI: --use-ai-configurator=true
    User->>CLI: run_profile(args)
    CLI->>Est: init(system, backend, version, model)
    CLI->>Prefill: profile_prefill_aiconfigurator(get_ttft via Est)
    Prefill->>Est: estimate_prefill_perf(isl)
    Est-->>Prefill: {context_latency}
    CLI->>Est: get_max_kv_tokens(isl, osl)
    CLI->>Decode: profile_decode_aiconfigurator(get_itl/throughput via Est)
    Decode->>Est: estimate_perf(isl, osl, batch)
    Est-->>Decode: {tpot, tokens/s/gpu}
    Decode-->>CLI: raw_data + plots
    Prefill-->>CLI: raw_data + plots
  end

  rect rgba(240,240,240,0.5)
    Note over CLI: --use-ai-configurator=false
    User->>CLI: run_profile(args)
    CLI->>Prefill: profile_prefill (benchmark)
    Prefill->>Bench: benchmark_prefill(isl)
    Bench-->>Prefill: TTFT
    CLI->>Decode: profile_decode (benchmark)
    Decode->>Bench: benchmark_decode(isl, osl, num_requests)
    Bench-->>Decode: ITL, throughput
    Decode-->>CLI: raw_data + plots
    Prefill-->>CLI: raw_data + plots
  end

  note right of CLI: *Helpers use callbacks to obtain metrics

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

In burrows of benchmarks I hop and I scheme,
Estimating tokens like a caffeinated dream.
No pods to deploy, just numbers to weigh—
TTFT, ITL, I plot all day.
With paws on the CLI, I configure with glee,
“AI, estimate!”—and off we flee. 🐇📈

Pre-merge checks

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 29.63%, which is below the required threshold of 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The title succinctly and accurately describes the primary change—adding the --use-ai-configurator CLI flag to benchmarks/profiler/profile_sla.py—and is concise, focused, and directly related to the main changes in this PR. It clearly signals the main intent to a reviewer scanning history.
Description Check ✅ Passed The PR description follows the repository template by providing Overview, Details, and "Where should the reviewer start" sections and clearly explains the purpose, advantages/disadvantages, added CLI flags, example usage, and reviewer guidance; it also documents added files and tests for reviewers to inspect. This makes the description mostly complete and actionable for review. The only missing template element is a "Related Issues" section linking an issue or explicitly stating none.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
benchmarks/profiler/profile_sla.py (2)

399-404: Return early if no prefill results to avoid min() on empty list

When prefill produced no results, the code logs an error but still proceeds to call min(prefill_ttft), which crashes. Exit the recommendations phase early.

Apply this diff:

         else:
             logger.info("Analyzing results and generate recommendations...")
             # Safety guards: no results → exit early with a clear message
             if not (prefill_tp_size and prefill_ttft and prefill_thpt_per_gpu):
                 logger.error("No prefill results produced; skipping recommendations.")
+                return

461-470: Guard KV-cache utilization division by zero

decode_kv_cache_size[...] can be 0 (parser returned 0), causing a crash. Add a safe path.

Apply this diff:

-            selected_decode_kv_cache_utilization = (
-                decode_concurrency[selected_decode_idx]
-                * (args.isl + (args.osl / 2))
-                / decode_kv_cache_size[selected_decode_idx]
-            )
+            denom = decode_kv_cache_size[selected_decode_idx]
+            if denom <= 0:
+                logger.warning("KV cache size is non-positive; skipping utilization recommendation.")
+                selected_decode_kv_cache_utilization = 0.0
+            else:
+                selected_decode_kv_cache_utilization = (
+                    decode_concurrency[selected_decode_idx]
+                    * (args.isl + (args.osl / 2))
+                    / denom
+                )
🧹 Nitpick comments (10)
benchmarks/profiler/utils/estimate_perf.py (1)

157-159: Document backend's private API usage

Using self.backend._get_memory_usage relies on a private method (prefixed with underscore). This could break in future aiconfigurator updates.

Consider adding a comment acknowledging this dependency:

 def get_mem_usage(bs: int):
+    # Note: Using private API _get_memory_usage from aiconfigurator backend
     return self.backend._get_memory_usage(
         model, self.database, bs, 1, isl, osl
     )["total"]
tests/profiler/test_profile_sla_aiconfigurator.py (2)

20-20: Remove unused noqa directive

The noqa: E402 directive is unnecessary as E402 is not enabled in your Ruff configuration.

-from benchmarks.profiler.profile_sla import run_profile  # noqa: E402
+from benchmarks.profiler.profile_sla import run_profile

31-31: Consider using a more secure temporary directory

Using /tmp/test_profiling_results could potentially lead to security issues in shared environments. Consider using Python's tempfile module for test directories.

You could enhance test isolation by using a unique temporary directory per test:

import tempfile

@pytest.fixture
def trtllm_args(self, tmp_path):
    class Args:
        backend = "trtllm"
        config = "components/backends/trtllm/deploy/disagg.yaml"
        output_dir = str(tmp_path / "test_profiling_results")
        # ... rest of the args

This would leverage pytest's built-in tmp_path fixture for better test isolation.

benchmarks/profiler/profile_sla.py (7)

328-335: Handle empty concurrency sweep and fix log message

Skip TP sizes where max_concurrency <= 0 (or none of the values fit) and fix the misleading log message.

Apply this diff:

             if not args.dry_run:
                 sweep_num_request = [
                     num for num in DECODE_NUM_REQUESTS_RANGE if num <= max_concurrency
                 ]
-                logger.info(
-                    f"Sweeping num_request range based on maximum number of kv tokens: {sweep_num_request}"
-                )
+                if not sweep_num_request:
+                    logger.warning(
+                        f"No feasible concurrency values under max_concurrency={max_concurrency}; skipping TP{tp_size}."
+                    )
+                    continue
+                logger.info(
+                    f"Sweeping num_request range based on computed max concurrency ({max_concurrency}): {sweep_num_request}"
+                )

392-395: Filter out empty-engine results before plotting decode

Avoid plotting with empty series to prevent downstream errors/warnings.

Apply this diff:

-        if decode_results:
-            plot_decode_performance(decode_results, args.itl, args.output_dir)
+        if decode_results:
+            nonempty_results = [
+                (tp, itl_list, thpt_list)
+                for (tp, itl_list, thpt_list) in decode_results
+                if itl_list and thpt_list
+            ]
+            if nonempty_results:
+                plot_decode_performance(nonempty_results, args.itl, args.output_dir)
+            else:
+                logger.warning("No non-empty decode results to plot.")

96-109: Consolidate flag validation (addresses Ruff TRY003/TRY301)

Reduce repetition and produce a single actionable error listing missing flags when --use-ai-configurator is set.

Apply this diff:

-        if args.use_ai_configurator:
-            if not args.aic_system:
-                raise ValueError(
-                    "Must provide --aic-system when using --use-ai-configurator."
-                )
-            if not args.aic_model_name:
-                raise ValueError(
-                    "Must provide --aic-model-name when using --use-ai-configurator."
-                )
-            if not args.backend_version:
-                raise ValueError(
-                    "Must provide --backend-version when using --use-ai-configurator."
-                )
+        if args.use_ai_configurator:
+            missing = []
+            if not args.aic_system:
+                missing.append("--aic-system")
+            if not args.aic_model_name:
+                missing.append("--aic-model-name")
+            if not args.backend_version:
+                missing.append("--backend-version")
+            if missing:
+                raise ValueError(
+                    "Missing required flags for --use-ai-configurator: " + ", ".join(missing)
+                )

112-116: Avoid forcing lowercase for --aic-system

Lowercasing may not match database keys (depends on aiconfigurator DB naming). Prefer passing the value verbatim and normalize inside the estimator if needed.

Apply this diff:

-                args.aic_system.lower(),
+                args.aic_system,

To confirm expected casing, please verify what AIConfiguratorPerfEstimator (and the underlying DB) expects for system names (e.g., h100_sxm vs H100_SXM). If normalization is needed, do it centrally in the estimator.


571-583: Decode interpolation: compute max_kv_tokens once and skip when zero

Mirror the earlier guard: bail out early if max_kv_tokens <= 0 to avoid empty plots or later errors.

Apply this diff:

         elif args.use_ai_configurator:
             max_kv_tokens = ai_configurator_perf_estimator.get_max_kv_tokens(
                 args.isl, args.osl, tp_size=best_decode_tp
             )
+            if max_kv_tokens <= 0:
+                logger.warning("Estimator returned non-positive max_kv_tokens; skipping decode interpolation.")
+                max_kv_tokens = 0
+                # Optionally return early or continue without plotting; choose one:
+                return
             profile_decode_aiconfigurator(

627-629: Surface estimator DB-not-found errors with actionable guidance

Catching ValueError from the estimator (e.g., DB not found) can produce clearer logs before re-raising.

Apply this diff:

-    except Exception as e:
+    except ValueError as e:
+        logger.error(f"AI-configurator initialization failed: {e}. "
+                     "Check --aic-system/--aic-model-name/--backend-version are supported by the local DB.")
+        raise
+    except Exception as e:
         logger.error(f"Profile job failed with error: {e}")
         raise

321-327: Skip decode TP when log parsing yields zero KV cache

If get_kv_cache_size_from_dynamo_log returns 0, max_concurrency becomes 0 and later steps may misbehave. Guard and continue.

Apply this diff:

             max_kv_tokens = config_modifier.get_kv_cache_size_from_dynamo_log(
                 f"{work_dir}/{client.deployment_name}/{WORKER_COMPONENT_NAMES[args.backend].decode_worker_k8s_name.lower()}/0.log"
             )
-            max_concurrency = max_kv_tokens // (args.isl + args.osl)
+            max_concurrency = max_kv_tokens // (args.isl + args.osl) if (args.isl + args.osl) > 0 else 0
+            if max_concurrency <= 0:
+                logger.warning("Parsed KV cache size is zero; skipping this TP size.")
+                await client.delete_deployment()
+                deployment_clients.remove(client)
+                logger.info("Deployment deleted")
+                continue
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0373b89 and d1d3ee0.

📒 Files selected for processing (6)
  • benchmarks/profiler/profile_sla.py (12 hunks)
  • benchmarks/profiler/utils/estimate_perf.py (1 hunks)
  • benchmarks/profiler/utils/profile_decode.py (4 hunks)
  • benchmarks/profiler/utils/profile_prefill.py (4 hunks)
  • tests/profiler/test_profile_sla_aiconfigurator.py (1 hunks)
  • tests/profiler/test_profile_sla_dryrun.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
tests/profiler/test_profile_sla_aiconfigurator.py (2)
benchmarks/profiler/profile_sla.py (1)
  • run_profile (63-634)
tests/profiler/test_profile_sla_dryrun.py (4)
  • trtllm_args (99-125)
  • Args (30-51)
  • Args (59-80)
  • Args (102-123)
benchmarks/profiler/profile_sla.py (6)
benchmarks/profiler/utils/estimate_perf.py (5)
  • estimate_perf (72-113)
  • AIConfiguratorPerfEstimator (29-210)
  • estimate_prefill_perf (115-135)
  • get_max_batch_size (137-188)
  • get_max_kv_tokens (190-210)
benchmarks/profiler/utils/profile_decode.py (2)
  • profile_decode (93-128)
  • profile_decode_aiconfigurator (130-155)
benchmarks/profiler/utils/profile_prefill.py (2)
  • profile_prefill (72-100)
  • profile_prefill_aiconfigurator (102-125)
benchmarks/profiler/utils/config.py (4)
  • get_kv_cache_size_from_dynamo_log (195-196)
  • get_kv_cache_size_from_dynamo_log (383-403)
  • get_kv_cache_size_from_dynamo_log (591-603)
  • get_kv_cache_size_from_dynamo_log (828-854)
deploy/utils/dynamo_deployment.py (2)
  • get_service_url (211-217)
  • delete_deployment (463-481)
benchmarks/profiler/utils/genai_perf.py (1)
  • benchmark_decode (189-247)
benchmarks/profiler/utils/profile_decode.py (2)
benchmarks/profiler/utils/estimate_perf.py (2)
  • estimate_perf (72-113)
  • AIConfiguratorPerfEstimator (29-210)
benchmarks/profiler/utils/genai_perf.py (1)
  • benchmark_decode (189-247)
benchmarks/profiler/utils/profile_prefill.py (2)
benchmarks/profiler/utils/estimate_perf.py (3)
  • estimate_perf (72-113)
  • AIConfiguratorPerfEstimator (29-210)
  • estimate_prefill_perf (115-135)
benchmarks/profiler/utils/genai_perf.py (1)
  • benchmark_prefill (154-186)
🪛 Ruff (0.12.2)
tests/profiler/test_profile_sla_aiconfigurator.py

20-20: Unused noqa directive (non-enabled: E402)

Remove unused noqa directive

(RUF100)


31-31: Probable insecure usage of temporary file or directory: "/tmp/test_profiling_results"

(S108)

benchmarks/profiler/utils/estimate_perf.py

53-55: Avoid specifying long messages outside the exception class

(TRY003)

benchmarks/profiler/profile_sla.py

98-100: Abstract raise to an inner function

(TRY301)


98-100: Avoid specifying long messages outside the exception class

(TRY003)


102-104: Abstract raise to an inner function

(TRY301)


102-104: Avoid specifying long messages outside the exception class

(TRY003)


106-108: Abstract raise to an inner function

(TRY301)


106-108: Avoid specifying long messages outside the exception class

(TRY003)

benchmarks/profiler/utils/profile_decode.py

26-26: Unused function argument: num_gpus

(ARG001)


102-102: Unused function argument: ai_configurator_perf_estimator

(ARG001)

🪛 GitHub Actions: Pre Merge Validation of (ai-dynamo/dynamo/refs/pull/3079/merge) by ilyasher.
benchmarks/profiler/utils/profile_decode.py

[error] 1-1: Ruff linting: 1 error found and auto-fixed by pre-commit; please re-run pre-commit to verify.

benchmarks/profiler/utils/profile_prefill.py

[error] 1-1: Ruff linting: 1 error found and auto-fixed by pre-commit; please re-run pre-commit to verify.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (6)
benchmarks/profiler/utils/estimate_perf.py (1)

53-55: Move exception message construction into ValueError

Following TRY003, avoid constructing long error messages outside the exception class.

Apply this fix:

 if not self.database:
     raise ValueError(
-        f"Database not found for system: {system}, backend: {backend}, version: {version}"
+        f"Database not found for system: {system}, backend: {backend}, version: {version}"
     )

Wait, the code already has the message inside the ValueError. This appears to be a false positive from the static analysis tool.

tests/profiler/test_profile_sla_dryrun.py (1)

48-51: LGTM! Consistent addition of AI configurator fields

The new fields (use_ai_configurator, aic_system, aic_model_name, backend_version) are properly initialized across all three test fixtures, maintaining consistency with the new AI configurator functionality.

Also applies to: 77-80, 120-123
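
A hedged sketch of what that fixture extension could look like (attribute names come from the summary above; the default values are assumptions, not the exact ones in the test file):

class Args:
    # ... existing profiling arguments ...
    use_ai_configurator = False
    aic_system = None
    aic_model_name = None
    backend_version = None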

tests/profiler/test_profile_sla_aiconfigurator.py (1)

87-103: LGTM! Comprehensive test coverage for AI configurator

The parameterized tests provide good coverage across multiple backend versions and model names, ensuring the AI configurator integration works with various configurations.

benchmarks/profiler/utils/profile_decode.py (2)

130-155: LGTM! Clean implementation of AI configurator integration

The profile_decode_aiconfigurator function properly delegates to the helper with a clean callback implementation that extracts the required metrics from the estimator's response.


1-156: Run pre-commit to apply Ruff auto-fixes

Pipeline reported a Ruff lint error that pre-commit auto-fixed — run pre-commit locally (e.g., pre-commit run --all-files) and commit the resulting fixes for benchmarks/profiler/utils/profile_decode.py.

benchmarks/profiler/utils/profile_prefill.py (1)

1-126: Run pre-commit or remove unused import Tuple in benchmarks/profiler/utils/profile_prefill.py

Ruff flagged an unused import — the file still contains from typing import Callable, Optional, Tuple (line 5). Run pre-commit run --all-files or remove Tuple and commit.

@tedzhouhk
Contributor

shall we add doc to

@ilyasher
Contributor Author

@tedzhouhk I just updated the README

ilyasher and others added 11 commits September 18, 2025 13:58
Commits signed off by Ilya Sherstyuk <[email protected]>; two co-authored by Hongkuan Zhou <[email protected]>.
@ilyasher ilyasher force-pushed the dev-isherstyuk-use-llm-pet-in-profile-sla branch from dd3cace to 1e5dd89 on September 18, 2025 20:58
@ilyasher
Contributor Author

/ok to test 6b6c005

@tedzhouhk tedzhouhk merged commit 19948b7 into main Sep 19, 2025
13 checks passed
@tedzhouhk tedzhouhk deleted the dev-isherstyuk-use-llm-pet-in-profile-sla branch September 19, 2025 18:23