@@ -75,7 +75,7 @@ def parse_kv_cache_metrics(log_output: str, free_mem_ratio: float = 0.8):

     # Simple patterns based on actual log format
     patterns = {
-        "current_cache_size": r"Current cache size:\s*(\d+)",
+        "current_cache_size": r"Current cache size \(MB\):\s*(\d+)",
         "free_mem_pre_mb": r"Free memory before forward pass \(MB\):\s*(\d+)",
         "free_mem_post_mb": r"Free memory after forward pass \(MB\):\s*(\d+)",
     }
@@ -89,6 +89,10 @@ def parse_kv_cache_metrics(log_output: str, free_mem_ratio: float = 0.8):
             print(f"  ✅ Found {metric_name}: {value}")
         else:
             print(f"  ❌ Could not find {metric_name}")
+    try:
+        metrics["current_cache_size"] = metrics["current_cache_size"] * 1024 * 1024
+    except KeyError:
+        print("  ❌ Could not find current_cache_size")
Comment on lines +92 to +95
Contributor
🛠️ Refactor suggestion

⚠️ Potential issue

Unconditional MB→bytes conversion is incorrect and duplicates the "not found" logging.

Parse both log formats explicitly and convert to bytes only when the value is in MB. This also avoids printing "Could not find current_cache_size" twice (once in the pattern loop, once in the except branch).

-    try:
-        metrics["current_cache_size"] = metrics["current_cache_size"] * 1024 * 1024
-    except KeyError:
-        print("  ❌ Could not find current_cache_size")
+    # Support both "Current cache size (MB): <mb>" and "Current cache size: <bytes>"
+    m_mb = re.search(r"Current cache size\s*\(MB\):\s*([\d.]+)", log_output, re.IGNORECASE)
+    m_bytes = re.search(r"Current cache size\s*:\s*(\d+)", log_output, re.IGNORECASE)
+    if m_mb:
+        current_mb = float(m_mb.group(1))
+        metrics["current_cache_size"] = int(current_mb * 1024 * 1024)
+        print(f"  ✅ Found current_cache_size (MB): {current_mb} -> {metrics['current_cache_size']} bytes")
+    elif m_bytes:
+        metrics["current_cache_size"] = int(m_bytes.group(1))
+        print(f"  ✅ Found current_cache_size (bytes): {metrics['current_cache_size']}")
+    else:
+        print("  ❌ Could not find current_cache_size")
🤖 Prompt for AI Agents
In tests/unittest/_torch/auto_deploy/unit/singlegpu/test_ad_trtllm_bench.py
around lines 92–95, the code unconditionally multiplies
metrics["current_cache_size"] by 1024*1024 and uses an except KeyError that
causes duplicate "not found" output; instead, first check presence with
metrics.get("current_cache_size"), if missing print a single "Could not find
current_cache_size" and return; if present, handle both formats: if the value is
a string ending with 'MB' (case-insensitive) parse the numeric part and multiply
by 1024*1024 to convert to bytes; if the value is a string ending with 'B' or a
plain numeric string parse it as bytes (no conversion); if it's already an
int/float treat it as bytes (no conversion); avoid raising KeyError so no
duplicate logs.
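The reviewer's suggestion can be sketched as a standalone helper. This is a minimal sketch, not code from the PR: the function name `parse_current_cache_size` and its return-`None`-on-miss convention are illustrative. It tries the new MB format first, then falls back to the old bytes format, and always returns bytes:

```python
import re

def parse_current_cache_size(log_output: str):
    """Return the current KV-cache size in bytes, or None if not found.

    Handles both log formats:
      - new: "Current cache size (MB): <mb>"  -> converted to bytes
      - old: "Current cache size: <bytes>"    -> used as-is
    """
    # New format: value is in MB, so convert to bytes.
    m_mb = re.search(r"Current cache size\s*\(MB\):\s*([\d.]+)", log_output, re.IGNORECASE)
    if m_mb:
        return int(float(m_mb.group(1)) * 1024 * 1024)

    # Old format: value is already in bytes. Checked second so the MB
    # pattern wins when both could apply.
    m_bytes = re.search(r"Current cache size\s*:\s*(\d+)", log_output, re.IGNORECASE)
    if m_bytes:
        return int(m_bytes.group(1))

    return None
```

Ordering matters: the bytes pattern's `\s*:` cannot match the `(MB):` form, but checking the MB pattern first makes the precedence explicit either way.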


# Calculate new_cache_size using the same formula as in resize_kv_cache
# new_cache_size = free_mem_post * 1024 * 1024 * free_mem_ratio + current_cache_size
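The formula in the comment above can be checked numerically. A minimal sketch, assuming `free_mem_post` is in MB and `current_cache_size` is in bytes (per the parsing code earlier in the diff); the helper name and the sample values are illustrative, not from the PR:

```python
def expected_new_cache_size(free_mem_post_mb: int,
                            current_cache_size_bytes: int,
                            free_mem_ratio: float = 0.8) -> int:
    # Mirrors the quoted formula:
    #   new_cache_size = free_mem_post * 1024 * 1024 * free_mem_ratio + current_cache_size
    return int(free_mem_post_mb * 1024 * 1024 * free_mem_ratio + current_cache_size_bytes)
```

For example, with 1000 MB free after the forward pass and a 512 MiB current cache, the target is 0.8 of the free memory plus the existing cache, all in bytes.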