docs: add a docs/guides/metrics.md (#2160)
Co-authored-by: Keiven Chang <[email protected]>
keivenchang and keivenchang authored Aug 1, 2025
commit faafa5ffe67c154597e8c68fbc565d68aedb2da2
297 changes: 276 additions & 21 deletions deploy/metrics/README.md
@@ -2,12 +2,17 @@

This directory contains configuration for visualizing metrics from the metrics aggregation service using Prometheus and Grafana.

## Components
> [!NOTE]
> For detailed information about Dynamo's metrics system, including hierarchical metrics, automatic labeling, and usage examples, see the [Metrics Guide](../../docs/guides/metrics.md).

## Overview

### Components

- **Prometheus Server**: Collects and stores metrics from Dynamo services and other components.
- **Grafana**: Provides dashboards by querying the Prometheus Server.

## Topology
### Topology

Default Service Relationship Diagram:
```mermaid
@@ -29,17 +34,63 @@ The dcgm-exporter service in the Docker Compose network is configured to use por

As of Q2 2025, Dynamo HTTP Frontend metrics are exposed when you build containers with `--framework VLLM` or `--framework TENSORRTLLM`.

### Available Metrics

#### Component Metrics

The core Dynamo backend system automatically exposes metrics with the `dynamo_component_*` prefix for all components that use the `DistributedRuntime` framework:

- `dynamo_component_concurrent_requests`: Requests currently being processed (gauge)
- `dynamo_component_request_bytes_total`: Total bytes received in requests (counter)
- `dynamo_component_request_duration_seconds`: Request processing time (histogram)
- `dynamo_component_requests_total`: Total requests processed (counter)
- `dynamo_component_response_bytes_total`: Total bytes sent in responses (counter)
- `dynamo_component_system_uptime_seconds`: DistributedRuntime uptime (gauge)
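
Counters such as `dynamo_component_requests_total` only ever increase, so dashboards usually plot their rate of change rather than the raw value. The sketch below is illustrative and not part of Dynamo: `CounterSample` and `per_second_rate` are hypothetical names, and the reset handling is a simplifying assumption. It shows, in plain Rust, the arithmetic a rate calculation performs over two scraped samples.

```rust
/// One scraped sample of a monotonically increasing counter.
/// (Hypothetical type, used only for this illustration.)
#[derive(Debug, Clone, Copy)]
pub struct CounterSample {
    /// Scrape time in seconds on any monotonic clock.
    pub timestamp_secs: f64,
    /// Counter value observed at that time.
    pub value: f64,
}

/// Average per-second increase between two samples.
///
/// Returns `None` when the samples are not in strict time order.
/// If the newer value is smaller than the older one, the counter is
/// assumed to have restarted from zero, so the newer value is taken
/// as the whole increase since the reset.
pub fn per_second_rate(older: CounterSample, newer: CounterSample) -> Option<f64> {
    let elapsed = newer.timestamp_secs - older.timestamp_secs;
    if elapsed <= 0.0 {
        return None;
    }
    let increase = if newer.value >= older.value {
        newer.value - older.value
    } else {
        newer.value
    };
    Some(increase / elapsed)
}
```

For example, samples of 100 and 150 taken ten seconds apart give a rate of 5 requests per second.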

#### Specialized Component Metrics

Some components expose additional metrics specific to their functionality:

- `dynamo_preprocessor_*`: Metrics specific to preprocessor components

#### Frontend Metrics

When you use the Dynamo HTTP Frontend (containers built with `--framework VLLM` or `--framework TENSORRTLLM`), the following metrics are exposed automatically with the `dynamo_frontend_*` prefix. Each one carries a `model` label containing the model name:

- `dynamo_frontend_inflight_requests`: Inflight requests (gauge)
- `dynamo_frontend_input_sequence_tokens`: Input sequence length (histogram)
- `dynamo_frontend_inter_token_latency_seconds`: Inter-token latency (histogram)
- `dynamo_frontend_output_sequence_tokens`: Output sequence length (histogram)
- `dynamo_frontend_request_duration_seconds`: LLM request duration (histogram)
- `dynamo_frontend_requests_total`: Total LLM requests (counter)
- `dynamo_frontend_time_to_first_token_seconds`: Time to first token (histogram)
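
To make the `model` label concrete, the sketch below renders a single sample in the Prometheus text exposition format, which is what a `/metrics` endpoint returns. It is a standalone illustration, not Dynamo code: `render_sample` and `escape_label_value` are hypothetical helpers, and the model name `my-model` mentioned afterwards is a placeholder.

```rust
/// Render one sample in the Prometheus text exposition format:
/// `name{label="value",...} value`, or `name value` without labels.
/// (Hypothetical helper, used only for this illustration.)
pub fn render_sample(name: &str, labels: &[(&str, &str)], value: f64) -> String {
    let rendered: Vec<String> = labels
        .iter()
        .map(|(key, val)| format!("{}=\"{}\"", key, escape_label_value(val)))
        .collect();
    if rendered.is_empty() {
        format!("{} {}", name, value)
    } else {
        format!("{}{{{}}} {}", name, rendered.join(","), value)
    }
}

/// Escape backslash, double quote, and newline inside a label value,
/// as the text format requires.
fn escape_label_value(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for c in value.chars() {
        match c {
            '\\' => out.push_str("\\\\"),
            '"' => out.push_str("\\\""),
            '\n' => out.push_str("\\n"),
            other => out.push(other),
        }
    }
    out
}
```

Calling `render_sample("dynamo_frontend_requests_total", &[("model", "my-model")], 42.0)` yields a line with the metric name, the `model="my-model"` label in braces, and the value. Grafana panels can then filter or group by that label.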

### Required Files

The following configuration files should be present in this directory:
- [docker-compose.yml](./docker-compose.yml): Defines the Prometheus and Grafana services
- [prometheus.yml](./prometheus.yml): Contains Prometheus scraping configuration
- [grafana-datasources.yml](./grafana-datasources.yml): Contains Grafana datasource configuration
- [grafana_dashboards/grafana-dashboard-providers.yml](./grafana_dashboards/grafana-dashboard-providers.yml): Contains Grafana dashboard provider configuration
- [grafana_dashboards/grafana-dynamo-dashboard.json](./grafana_dashboards/grafana-dynamo-dashboard.json): A general Dynamo Dashboard for both SW and HW metrics.
- [grafana_dashboards/grafana-dcgm-metrics.json](./grafana_dashboards/grafana-dcgm-metrics.json): Contains Grafana dashboard configuration for DCGM GPU metrics
- [grafana_dashboards/grafana-llm-metrics.json](./grafana_dashboards/grafana-llm-metrics.json): This file, which is being phased out, contains the Grafana dashboard configuration for LLM-specific metrics. It requires an additional `metrics` component to operate concurrently. A new version is under development.

## Getting Started

### Prerequisites

1. Make sure Docker and Docker Compose are installed on your system

2. Start Dynamo dependencies. Assume you're at the root dynamo path:
### Quick Start

1. Start the Dynamo dependencies. The commands below assume you are at the root of the dynamo repository:

```bash
# Start the basic services (etcd & natsd), along with Prometheus and Grafana
docker compose -f deploy/docker-compose.yml --profile metrics up -d

# Minimum components for Dynamo: etcd/nats/dcgm-exporter
# Minimum components for Dynamo (will not have Prometheus and Grafana): etcd/nats/dcgm-exporter
docker compose -f deploy/docker-compose.yml up -d
```

@@ -48,24 +99,22 @@ As of Q2 2025, Dynamo HTTP Frontend metrics are exposed when you build container
export CUDA_VISIBLE_DEVICES=0,2
```

3. Web servers started. The ones that end in /metrics are in Prometheus format:
2. The following web servers are now running. Endpoints that end in `/metrics` serve data in Prometheus format:
- Grafana: `http://localhost:3001` (default login: dynamo/dynamo)
- Prometheus Server: `http://localhost:9090`
- NATS Server: `http://localhost:8222` (monitoring endpoints: /varz, /healthz, etc.)
- NATS Prometheus Exporter: `http://localhost:7777/metrics`
- etcd Server: `http://localhost:2379/metrics`
- DCGM Exporter: `http://localhost:9401/metrics`

4. Optionally, if you want to experiment further, look through components/metrics/README.md for more details on launching a metrics server (subscribes to nats), mock_worker (publishes to nats), and real workers.

- Start the [components/metrics](../../components/metrics/README.md) application to begin monitoring for metric events from dynamo workers and aggregating them on a Prometheus metrics endpoint: `http://localhost:9091/metrics`.
- Uncomment the appropriate lines in prometheus.yml to poll port 9091.
- Start worker(s) that publish KV Cache metrics. The example in [lib/runtime/examples/service_metrics/README.md](../../lib/runtime/examples/service_metrics/README.md) can populate dummy KV Cache metrics.

### Configuration

## Configuration

### Prometheus
#### Prometheus

The Prometheus configuration is specified in [prometheus.yml](./prometheus.yml). This file is set up to collect metrics from the metrics aggregation service endpoint.

@@ -77,29 +126,233 @@ After making changes to prometheus.yml, it is necessary to reload the configurat
docker compose -f deploy/docker-compose.yml up prometheus -d --force-recreate
```

### Grafana
#### Grafana

Grafana is pre-configured with:
- Prometheus datasource
- Sample dashboard for visualizing service metrics
![grafana image](./grafana-dynamo-composite.png)

## Required Files
### Troubleshooting

The following configuration files should be present in this directory:
- [docker-compose.yml](./docker-compose.yml): Defines the Prometheus and Grafana services
- [prometheus.yml](./prometheus.yml): Contains Prometheus scraping configuration
- [grafana-datasources.yml](./grafana-datasources.yml): Contains Grafana datasource configuration
- [grafana_dashboards/grafana-dashboard-providers.yml](./grafana_dashboards/grafana-dashboard-providers.yml): Contains Grafana dashboard provider configuration
- [grafana_dashboards/grafana-dynamo-dashboard.json](./grafana_dashboards/grafana-dynamo-dashboard.json): A general Dynamo Dashboard for both SW and HW metrics.
- [grafana_dashboards/grafana-dcgm-metrics.json](./grafana_dashboards/grafana-dcgm-metrics.json): Contains Grafana dashboard configuration for DCGM GPU metrics
- [grafana_dashboards/grafana-llm-metrics.json](./grafana_dashboards/grafana-llm-metrics.json): This file, which is being phased out, contains the Grafana dashboard configuration for LLM-specific metrics. It requires an additional `metrics` component to operate concurrently. A new version is under development.
1. Verify services are running:
```bash
docker compose ps
```

## Running the deprecated `metrics` component
2. Check logs:
```bash
docker compose logs prometheus
docker compose logs grafana
```

3. For issues with the legacy metrics component (being phased out), see [components/metrics/README.md](../../components/metrics/README.md) for details on the exposed metrics and troubleshooting steps.

## Developer Guide

### Creating Metrics at Different Hierarchy Levels

#### Runtime-Level Metrics

```rust
use dynamo_runtime::DistributedRuntime;

let runtime = DistributedRuntime::new()?;
let namespace = runtime.namespace("my_namespace")?;
let component = namespace.component("my_component")?;
let endpoint = component.endpoint("my_endpoint")?;

// Metrics are created from a handle in the hierarchy; this example uses the endpoint handle.
// `create_counter` returns a Prometheus Counter.
let total_requests = endpoint.create_counter(
    "total_requests",
    "Total requests across all namespaces",
    &[],
)?;

let active_connections = endpoint.create_gauge(
    "active_connections",
    "Number of active client connections",
    &[],
)?;
```

#### Namespace-Level Metrics

```rust
let namespace = runtime.namespace("my_model")?;

// Namespace-scoped metrics
let model_requests = namespace.create_counter(
    "model_requests",
    "Requests for this specific model",
    &[],
)?;

let model_latency = namespace.create_histogram(
    "model_latency_seconds",
    "Model inference latency",
    &[],
    &[0.001, 0.01, 0.1, 1.0, 10.0],
)?;
```

#### Component-Level Metrics

```rust
let component = namespace.component("backend")?;

// Component-specific metrics
let backend_requests = component.create_counter(
    "backend_requests",
    "Requests handled by this backend component",
    &[],
)?;

let gpu_memory_usage = component.create_gauge(
    "gpu_memory_bytes",
    "GPU memory usage in bytes",
    &[],
)?;
```

#### Endpoint-Level Metrics

```rust
let endpoint = component.endpoint("generate")?;

// Endpoint-specific metrics
let generate_requests = endpoint.create_counter(
    "generate_requests",
    "Generate endpoint requests",
    &[],
)?;

let generate_latency = endpoint.create_histogram(
    "generate_latency_seconds",
    "Generate endpoint latency",
    &[],
    &[0.001, 0.01, 0.1, 1.0, 10.0],
)?;
```

### Creating Vector Metrics with Dynamic Labels

Use vector metrics when you need to track metrics with different label values:

```rust
// Counter with labels
let requests_by_model = endpoint.create_counter_vec(
    "requests_by_model",
    "Requests by model type",
    &["model_type", "model_size"],
)?;

// Increment with specific labels
requests_by_model.with_label_values(&["llama", "7b"]).inc();
requests_by_model.with_label_values(&["gpt", "13b"]).inc();

// Gauge with labels
let memory_by_gpu = component.create_gauge_vec(
    "gpu_memory_bytes",
    "GPU memory usage by device",
    &["gpu_id", "memory_type"],
)?;

memory_by_gpu.with_label_values(&["0", "allocated"]).set(8192.0);
memory_by_gpu.with_label_values(&["0", "cached"]).set(4096.0);
```
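
Conceptually, a vector metric is a family of independent time series keyed by the combination of label values. The toy type below sketches that idea with only the standard library. `CounterFamily` is a hypothetical stand-in, not the type returned by `create_counter_vec`, and returning an error on a label-count mismatch is a design choice made for this example.

```rust
use std::collections::HashMap;

/// Toy stand-in for a labeled counter family: one counter per distinct
/// combination of label values. (Hypothetical, for illustration only.)
pub struct CounterFamily {
    label_names: Vec<String>,
    children: HashMap<Vec<String>, u64>,
}

impl CounterFamily {
    /// Declare the label names once, when the family is created.
    pub fn new(label_names: &[&str]) -> Self {
        Self {
            label_names: label_names.iter().map(|s| s.to_string()).collect(),
            children: HashMap::new(),
        }
    }

    /// Increment the child counter for `values`, creating it on first use.
    /// Fails if the number of values differs from the number of label names.
    pub fn inc(&mut self, values: &[&str]) -> Result<(), String> {
        if values.len() != self.label_names.len() {
            return Err(format!(
                "expected {} label values, got {}",
                self.label_names.len(),
                values.len()
            ));
        }
        let key: Vec<String> = values.iter().map(|s| s.to_string()).collect();
        *self.children.entry(key).or_insert(0) += 1;
        Ok(())
    }

    /// Current value for one label combination (0 if never incremented).
    pub fn get(&self, values: &[&str]) -> u64 {
        let key: Vec<String> = values.iter().map(|s| s.to_string()).collect();
        self.children.get(&key).copied().unwrap_or(0)
    }

    /// Number of distinct time series created so far.
    pub fn series_count(&self) -> usize {
        self.children.len()
    }
}
```

Because every new label combination creates another series, label values are best kept to a small, bounded set (model names or GPU indices rather than request IDs).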

### Creating Histograms

Histograms are useful for measuring distributions of values like latency:

```rust
let latency_histogram = endpoint.create_histogram(
    "request_latency_seconds",
    "Request latency distribution",
    &[],
    &[0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0, 5.0],
)?;

// Record latency values
latency_histogram.observe(0.023); // 23ms
latency_histogram.observe(0.156); // 156ms
```
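
Prometheus histogram buckets are cumulative: each bucket counts every observation less than or equal to its upper bound, and an implicit `+Inf` bucket counts all of them. The sketch below models that bookkeeping with the standard library only. `BucketedHistogram` is a hypothetical type written for this explanation, not the histogram that `create_histogram` returns.

```rust
/// Toy histogram with Prometheus-style cumulative buckets.
/// (Hypothetical type, for illustration only.)
pub struct BucketedHistogram {
    upper_bounds: Vec<f64>,
    /// cumulative_counts[i] counts observations <= upper_bounds[i];
    /// the extra final slot is the implicit +Inf bucket.
    cumulative_counts: Vec<u64>,
    sum: f64,
}

impl BucketedHistogram {
    /// `upper_bounds` is expected to be sorted in ascending order.
    pub fn new(upper_bounds: &[f64]) -> Self {
        Self {
            upper_bounds: upper_bounds.to_vec(),
            cumulative_counts: vec![0; upper_bounds.len() + 1],
            sum: 0.0,
        }
    }

    /// Record one observation in every bucket whose bound covers it.
    pub fn observe(&mut self, value: f64) {
        for (i, bound) in self.upper_bounds.iter().enumerate() {
            if value <= *bound {
                self.cumulative_counts[i] += 1;
            }
        }
        // The +Inf bucket counts every observation.
        let last = self.cumulative_counts.len() - 1;
        self.cumulative_counts[last] += 1;
        self.sum += value;
    }

    /// Cumulative count for the bucket with exactly this upper bound.
    pub fn count_le(&self, bound: f64) -> Option<u64> {
        self.upper_bounds
            .iter()
            .position(|b| *b == bound)
            .map(|i| self.cumulative_counts[i])
    }

    /// Total number of observations (the +Inf bucket).
    pub fn total_count(&self) -> u64 {
        self.cumulative_counts[self.cumulative_counts.len() - 1]
    }

    /// Sum of all observed values.
    pub fn sum(&self) -> f64 {
        self.sum
    }
}
```

With the bucket bounds used above, observing 0.023 and 0.156 leaves the `0.01` bucket at 0, the `0.05` and `0.1` buckets at 1, and the `0.5` bucket and everything above it at 2.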

### Transitioning from Plain Prometheus

If you're currently using plain Prometheus metrics, transitioning to Dynamo's `MetricsRegistry` is straightforward:

#### Before (Plain Prometheus)

```rust
use prometheus::{Counter, Opts, Registry};

// Create a registry to hold metrics
let registry = Registry::new();
let counter_opts = Opts::new("my_counter", "My custom counter");
let counter = Counter::with_opts(counter_opts).unwrap();
registry.register(Box::new(counter.clone())).unwrap();

// Use the counter
counter.inc();

// To expose metrics, you'd need to set up an HTTP server manually
// and implement the /metrics endpoint yourself
```

#### After (Dynamo MetricsRegistry)

```rust
let counter = endpoint.create_counter(
    "my_counter",
    "My custom counter",
    &[],
)?;

counter.inc();
```

**Note:** The metric is automatically registered when created via the endpoint's `create_counter` factory method.

**Benefits of Dynamo's approach:**
- Automatic registration: metrics created via the `create_*` factory methods are registered with the system without a separate `register` call
- Automatic labeling with namespace, component, and endpoint information
- Consistent metric naming with the `dynamo_` prefix
- Built-in HTTP metrics endpoint when enabled with `DYN_SYSTEM_ENABLED=true`
- Hierarchical metric organization
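
The automatic labeling and naming can be pictured as follows. This is a rough sketch under stated assumptions, not Dynamo's implementation: `Scope` and `prefixed_name` are hypothetical, and the label keys `namespace`, `component`, and `endpoint` are placeholders that may differ from the keys Dynamo actually emits. The only detail taken from this README is the `dynamo_component_` prefix.

```rust
/// Where a metric sits in the namespace > component > endpoint hierarchy.
/// `None` means the metric was created above that level.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, Default)]
pub struct Scope {
    pub namespace: Option<String>,
    pub component: Option<String>,
    pub endpoint: Option<String>,
}

impl Scope {
    /// Labels implied by the scope: a metric created at a deeper level
    /// also carries the labels of its ancestors. The label keys are
    /// placeholders chosen for this example.
    pub fn labels(&self) -> Vec<(&'static str, String)> {
        let mut labels = Vec::new();
        if let Some(namespace) = &self.namespace {
            labels.push(("namespace", namespace.clone()));
        }
        if let Some(component) = &self.component {
            labels.push(("component", component.clone()));
        }
        if let Some(endpoint) = &self.endpoint {
            labels.push(("endpoint", endpoint.clone()));
        }
        labels
    }
}

/// Add the shared prefix to a short metric name.
pub fn prefixed_name(short_name: &str) -> String {
    format!("dynamo_component_{}", short_name)
}
```

The caller supplies only a short name such as `requests_total`. The hierarchy supplies the rest, which is what keeps names and labels consistent across components.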

### Advanced Features

#### Custom Buckets for Histograms

```rust
// Define custom buckets for your use case
let custom_buckets = vec![0.001, 0.01, 0.1, 1.0, 10.0];
let latency = endpoint.create_histogram(
    "api_latency_seconds",
    "API latency in seconds",
    &[],
    &custom_buckets,
)?;
```
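
Latency bounds like the ones above are often spaced exponentially so that fast and slow requests both get useful resolution. The helper below is a standard-library sketch that generates such a list. It is written for this example and is not a Dynamo API.

```rust
/// Build `count` bucket upper bounds: start, start*factor, start*factor^2, ...
/// (Illustrative helper, not part of Dynamo.)
pub fn exponential_buckets(start: f64, factor: f64, count: usize) -> Result<Vec<f64>, String> {
    if count == 0 {
        return Err("count must be at least 1".to_string());
    }
    // Written as negated comparisons so that NaN inputs are rejected too.
    if !(start > 0.0) {
        return Err("start must be positive".to_string());
    }
    if !(factor > 1.0) {
        return Err("factor must be greater than 1".to_string());
    }
    let mut bounds = Vec::with_capacity(count);
    let mut next = start;
    for _ in 0..count {
        bounds.push(next);
        next *= factor;
    }
    Ok(bounds)
}
```

`exponential_buckets(0.001, 10.0, 5)` produces five bounds from one millisecond to ten seconds, matching the list above up to floating-point rounding.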

#### Metric Aggregation

```rust
// Aggregate metrics across multiple endpoints
let total_requests = namespace.create_counter(
    "total_requests",
    "Total requests across all endpoints",
    &[],
)?;
```
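
One way to read the snippet above is that a single namespace-level counter is shared by several endpoint handlers, each of which increments it. The sketch below simulates that pattern with threads and an atomic integer from the standard library. `run_handlers` and the endpoint names passed to it are hypothetical, and whether Dynamo counters are shared this way is an assumption of this example.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Simulate several endpoint handlers incrementing one shared,
/// namespace-level counter, and return the final total.
/// (Illustrative only; each thread stands in for one endpoint.)
pub fn run_handlers(endpoints: &[&str], requests_per_endpoint: u64) -> u64 {
    let total = Arc::new(AtomicU64::new(0));

    let workers: Vec<_> = endpoints
        .iter()
        .map(|_endpoint_name| {
            // Each handler holds its own reference to the same counter.
            let total = Arc::clone(&total);
            thread::spawn(move || {
                for _ in 0..requests_per_endpoint {
                    total.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    // Joining the threads makes all of their increments visible here.
    for worker in workers {
        worker.join().expect("handler thread panicked");
    }
    total.load(Ordering::Relaxed)
}
```

With three endpoints handling 1000 requests each, the shared counter ends at 3000, while per-endpoint counters would still be needed to break that total down.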

## Running the deprecated `components/metrics` program

⚠️ **DEPRECATION NOTICE** ⚠️

When you run the example [components/metrics](../../components/metrics/README.md) component, it exposes a Prometheus /metrics endpoint with the following metrics (defined in [components/metrics/src/lib.rs](../../components/metrics/src/lib.rs)):
When you run the example [components/metrics](../../components/metrics/README.md) program, it exposes a Prometheus /metrics endpoint with the following metrics (defined in [components/metrics/src/lib.rs](../../components/metrics/src/lib.rs)):

**⚠️ The following `llm_kv_*` metrics are deprecated:**

@@ -123,3 +376,5 @@ When you run the example [components/metrics](../../components/metrics/README.md
docker compose logs prometheus
docker compose logs grafana
```

3. For issues with the legacy metrics component (being phased out), see [components/metrics/README.md](../../components/metrics/README.md) for details on the exposed metrics and troubleshooting steps.