feat: add audit logging for chat completions #3062

ryan-lempka · 2025-09-16T17:56:06Z

Overview:

Captures the request and response of a chat completion and writes them to stderr. To use this feature, set the environment variable `DYN_AUDIT_ENABLED=1` and pass `store=true` in the request.

Enables compliance, distillation, evaluation, and analytics use cases.

This PR starts with stderr as the only target, but it is designed so that other targets, such as a persistent stream, can be added later. An independent process can then monitor that stream, consume the audit data, and persist it in a database.

Details:

Environment controls: DYN_AUDIT_ENABLED=1 enables auditing, DYN_AUDIT_CAPACITY sets buffer size
Universal support: Works for streaming/non-streaming, tool-calling, reasoning content
Off-hot-path: Bus + sink worker architecture with async I/O to stderr (extensible to other targets)
Transport-agnostic: Single integration point in preprocessor works across HTTP/gRPC
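
As an illustration of the environment controls listed above, the two variables could be parsed roughly as follows. This is a minimal sketch, not the PR's code: `parse_enabled` and `parse_capacity` are hypothetical helper names, the accepted truthy values ("1" or "true", case-insensitive) follow the review note on env flag parsing, and the default of 1024 follows the walkthrough's description of `DYN_AUDIT_CAPACITY`. Treating zero as invalid is an assumption.

```rust
use std::env;

/// Hypothetical helper: interprets an optional env value as an on/off flag.
/// Accepts "1" or "true" (case-insensitive); anything else, or unset, is off.
fn parse_enabled(raw: Option<&str>) -> bool {
    match raw {
        Some(v) => {
            let v = v.trim();
            v == "1" || v.eq_ignore_ascii_case("true")
        }
        None => false,
    }
}

/// Hypothetical helper: parses the buffer capacity, falling back to `default`
/// when the value is unset, unparsable, or zero (the zero rule is an assumption).
fn parse_capacity(raw: Option<&str>, default: usize) -> usize {
    raw.and_then(|v| v.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(default)
}

fn main() {
    let enabled = parse_enabled(env::var("DYN_AUDIT_ENABLED").ok().as_deref());
    let capacity = parse_capacity(env::var("DYN_AUDIT_CAPACITY").ok().as_deref(), 1024);
    println!("audit enabled={} capacity={}", enabled, capacity);
}
```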

Architecture: Audit happens post-transform for consistency with client output. Broadcast bus enables fan-out without blocking requests.
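
To make the "fan-out without blocking requests" idea concrete, here is a small sketch built only on the standard library. It is not the PR's implementation: bounded `std::sync::mpsc` queues and a thread stand in for the broadcast channel and async sink workers, and `Bus`, `Record`, and the drop-on-full policy are assumptions chosen for illustration.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};
use std::sync::Arc;
use std::thread;

/// Illustrative record; the real `AuditRecord` carries more fields.
#[derive(Debug)]
struct Record {
    request_id: String,
}

/// Hypothetical fan-out bus: one bounded queue per subscriber.
struct Bus {
    subscribers: Vec<SyncSender<Arc<Record>>>,
    capacity: usize,
}

impl Bus {
    fn new(capacity: usize) -> Self {
        Bus { subscribers: Vec::new(), capacity }
    }

    fn subscribe(&mut self) -> Receiver<Arc<Record>> {
        let (tx, rx) = sync_channel(self.capacity);
        self.subscribers.push(tx);
        rx
    }

    /// Never blocks the caller: a full or closed queue drops the record for
    /// that subscriber. Returns how many subscribers accepted the record.
    fn publish(&self, record: Record) -> usize {
        let shared = Arc::new(record);
        self.subscribers
            .iter()
            .filter(|tx| match tx.try_send(Arc::clone(&shared)) {
                Ok(()) => true,
                Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => false,
            })
            .count()
    }
}

fn main() {
    let mut bus = Bus::new(8);
    let rx = bus.subscribe();
    // Sink worker: drains the queue off the request path and writes to stderr.
    let worker = thread::spawn(move || {
        let mut seen = 0;
        for rec in rx {
            eprintln!("audit: {}", rec.request_id);
            seen += 1;
        }
        seen
    });
    bus.publish(Record { request_id: "req-1".to_string() });
    drop(bus); // closing the senders lets the worker loop end
    assert_eq!(worker.join().unwrap(), 1);
}
```

The trade-off in this sketch is that a slow sink loses records rather than slowing requests; whether the PR makes the same choice is not stated here.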

Example outputs:

// `store=true` in the request and `DYN_AUDIT_ENABLED=1`
{"schema_version":1,"request_id":"d8e62fae-7107-4567-993f-79d5f9757b1b","requested_streaming":true,"mode":"full","model":"deepseek-ai/DeepSeek-R1-Distill-Llama-8B","usage":null,"request":{"messages":[{"role":"user","content":"What is 10 + 10?"}],"model":"deepseek-ai/DeepSeek-R1-Distill-Llama-8B","store":true,"max_tokens":1000,"stream":true},"response":{"id":"chatcmpl-d8e62fae-7107-4567-993f-79d5f9757b1b","choices":[{"index":0,"message":{"content":"...","role":"assistant"},"finish_reason":"stop"}],"created":1758740068,"model":"deepseek-ai/DeepSeek-R1-Distill-Llama-8B","object":"chat.completion","usage":null}}

Where to start:

lib/llm/src/audit/ - Core system (bus, handle, sink, stream)
lib/llm/src/preprocessor.rs - Integration point
lib/llm/src/entrypoint/input.rs - Initialization

Summary by CodeRabbit

New Features
- Added an auditing subsystem for chat completions, publishing usage-only or full request/response records.
- Configurable via environment variables:
  - DYN_AUDIT_ENABLED to toggle auditing
  - DYN_AUDIT_SINKS to select outputs (default: stderr)
  - DYN_AUDIT_CAPACITY to size the event buffer
- Streaming responses are unchanged for users; records are aggregated post-stream for full audits.
- Background workers are started automatically when enabled, with readiness logging.

coderabbitai · 2025-09-16T18:02:16Z

Walkthrough

Adds a new auditing subsystem: configuration, a broadcast bus, record/handle types, sinks with worker tasks, and a streaming passthrough that aggregates chunks into a final response. Wires initialization into the input entrypoint, conditionally starting the bus and sinks based on env-driven policy and capacity.

Changes

Cohort / File(s)	Summary
Audit module scaffolding `lib/llm/src/audit/mod.rs`	Introduces the audit module and publicly re-exports submodules: bus, config, handle, sink, stream.
Audit event bus `lib/llm/src/audit/bus.rs`	Adds a OnceLock-backed broadcast sender for `Arc<AuditRecord>` with `init(capacity)`, `subscribe()`, and `publish(AuditRecord)`.
Audit configuration `lib/llm/src/audit/config.rs`	Adds `AuditPolicy { enabled }`, a OnceLock policy store, `init_from_env()` reading `DYN_AUDIT_ENABLED`, and `policy()` accessor.
Audit handle and record types `lib/llm/src/audit/handle.rs`	Defines `AuditMode`, `AuditRecord`, `AuditHandle`, and `CompletionUsage` alias. Provides `create_handle(...)`, setters, and `emit()` to publish to the bus.
Audit sinks and workers `lib/llm/src/audit/sink.rs`	Introduces `AuditSink` trait and `StderrSink`. Parses `DYN_AUDIT_SINKS` and `spawn_workers_from_env()` to subscribe and emit records per sink.
Streaming passthrough and aggregation `lib/llm/src/audit/stream.rs`	Adds `PassThroughWithAgg<S>` stream wrapper collecting chunks and `scan_aggregate_with_future(...)` returning passthrough plus a future for aggregated response; includes tests.
Entrypoint wiring `lib/llm/src/entrypoint/input.rs`	On startup, if policy enabled: initializes audit bus with `DYN_AUDIT_CAPACITY` (default 1024), spawns sink workers, and logs capacity.
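
The once-initialized policy store described in the table can be pictured with the sketch below. It is an illustration only: `AuditPolicy { enabled }` follows the summary above, while `policy_or_init` and the exact parsing of `DYN_AUDIT_ENABLED` are assumptions rather than the PR's code.

```rust
use std::sync::OnceLock;

/// Mirrors the `AuditPolicy { enabled }` shape named in the summary.
#[derive(Debug, Clone, Copy, PartialEq)]
struct AuditPolicy {
    enabled: bool,
}

// Process-wide slot that can be written at most once.
static POLICY: OnceLock<AuditPolicy> = OnceLock::new();

/// Hypothetical accessor: the first caller's initializer wins; later
/// initializers are ignored and the stored policy is returned.
fn policy_or_init(init: impl FnOnce() -> AuditPolicy) -> AuditPolicy {
    *POLICY.get_or_init(init)
}

fn main() {
    // Assumed parsing: only the literal "1" enables auditing in this sketch.
    let from_env = || AuditPolicy {
        enabled: std::env::var("DYN_AUDIT_ENABLED")
            .map(|v| v == "1")
            .unwrap_or(false),
    };
    let policy = policy_or_init(from_env);
    println!("audit enabled: {}", policy.enabled);
}
```

Because the slot is written once, every later read sees the same policy for the life of the process, which is why the entrypoint can check it a single time at startup.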

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant EP as Entrypoint
  participant C as audit::config
  participant B as audit::bus
  participant S as audit::sink
  participant H as audit::handle
  participant App as App Logic

  EP->>C: policy()
  alt policy.enabled
    EP->>B: init(capacity from DYN_AUDIT_CAPACITY)
    EP->>S: spawn_workers_from_env()
    S->>B: subscribe()
    Note right of S: Workers ready to receive AuditRecord
  else
    Note over EP: Auditing disabled
  end

  App->>H: create_handle(req, request_id)
  alt Some(handle)
    App->>H: set_request(req) / add_usage(...)
    App->>H: set_response(resp)
    App->>H: emit()
    H->>B: publish(AuditRecord)
    B-->>S: broadcast Arc<AuditRecord>
    S->>S: emit(record) via each configured sink
  else None
    Note over App: Skip auditing
  end

sequenceDiagram
  autonumber
  participant Client as Client
  participant Stream as Upstream Stream
  participant PTA as PassThroughWithAgg
  participant Agg as Aggregator Task
  participant Fut as Aggregation Future

  Client->>PTA: poll_next()
  PTA->>Stream: poll_next()
  alt Next chunk
    Stream-->>PTA: Annotated<Chunk>
    PTA->>PTA: buffer clone
    PTA-->>Client: forward chunk
  else End of stream
    Stream-->>PTA: None
    PTA->>Agg: spawn aggregate(buffer)
    Agg-->>Fut: send final Response
    PTA-->>Client: None
  end

  Note over Fut: Future resolves to NvCreateChatCompletionResponse (fallback on failure)
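
The second diagram can be approximated with a synchronous analogue. In this sketch an `Iterator` stands in for the async stream and a `std::sync::mpsc` receiver stands in for the aggregation future; `TeeWithAggregate` and `tee_with_aggregate` are hypothetical names, chunks are plain strings, and concatenation stands in for the real aggregation logic.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical wrapper: forwards each chunk unchanged, keeps a copy, and
/// sends the aggregate once the inner iterator is exhausted.
struct TeeWithAggregate<I> {
    inner: I,
    buffer: Vec<String>,
    done: Option<Sender<String>>,
}

impl<I: Iterator<Item = String>> Iterator for TeeWithAggregate<I> {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        match self.inner.next() {
            Some(chunk) => {
                // Buffer a clone; the caller still receives the original chunk.
                self.buffer.push(chunk.clone());
                Some(chunk)
            }
            None => {
                // End of stream: aggregate exactly once.
                if let Some(tx) = self.done.take() {
                    // The receiver may already be gone; that is not an error here.
                    let _ = tx.send(self.buffer.concat());
                }
                None
            }
        }
    }
}

/// Returns the passthrough iterator plus a receiver for the aggregated text.
fn tee_with_aggregate<I: Iterator<Item = String>>(
    inner: I,
) -> (TeeWithAggregate<I>, Receiver<String>) {
    let (tx, rx) = channel();
    (TeeWithAggregate { inner, buffer: Vec::new(), done: Some(tx) }, rx)
}

fn main() {
    let chunks = vec!["10 + 10".to_string(), " = 20".to_string()];
    let (stream, aggregated) = tee_with_aggregate(chunks.into_iter());
    for chunk in stream {
        print!("{}", chunk); // the client sees chunks as they arrive
    }
    println!();
    eprintln!("aggregated: {}", aggregated.recv().unwrap());
}
```

If the wrapper is dropped before the inner iterator ends, the sender is dropped with it and `recv()` returns an error, which loosely corresponds to the "fallback on failure" note in the diagram.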

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

chore: remove flatten for chat response types, add reasoning_content #2543 — Touches chat completion aggregation types used by the new audit streaming/aggregation path.
feat: [vLLM] implement cli args for tool and reasoning parsers #2619 — Changes aggregation signatures (e.g., StreamArgs) invoked by scan_aggregate_with_future and DeltaAggregator.

Poem

A rabbit taps logs with a tiny paw,
Catching each whisper, each token it saw.
Streams trickle by, then gather and sing—
A record takes flight on broadcast wing.
Stderr glows softly: “audit complete.”
Hippity-hop—observability sweet.

Pre-merge checks

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 55.17% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.
Description Check	⚠️ Warning	The pull request description includes the Overview, Details, and a "Where to start" section, but it omits the required "Related Issues" section from the repository template and does not use the exact heading "#### Where should the reviewer start?" as specified, making it incomplete relative to the required structure.	Please add the missing "#### Related Issues" section with the correct action keyword (Closes/Fixes/Resolves) and issue reference, and rename the "Where to start:" heading to exactly "#### Where should the reviewer start?" to fully conform to the template.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title Check	✅ Passed	The title succinctly and accurately describes the primary change of adding audit logging for chat completions, aligning with the PR’s objective and providing clear context to reviewers.


coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (7)

lib/llm/src/http/service.rs (1)
31-31: Confirm need for public exposure of audit module

If external consumers don’t need to call auditing APIs, prefer restricting scope to avoid expanding the public surface.

Apply if internal-only:
-pub mod audit;
+pub(crate) mod audit;
lib/llm/src/http/service/openai.rs (2)
492-501: Audit gating and conditional clone: OK; consider deferring clone to success path

Current approach avoids clone unless needed. You could further defer the clone until after the response is folded to skip cloning on error paths (minor perf), but it would reflect stream=true in the logged request. If preserving the original stream flag matters, keep as-is.

597-601: Emit request/response as structured fields and add an explicit tracing target

With the current % formatter, request/response become stringified JSON. If your JSONL pipeline expects nested objects, serialize via a serde-aware field or attach a dedicated target for easier filtering.

Proposed minimal change (adds a target; keeps current field formatting). For fully structured fields, see follow-up note.
-        if let Some(req_copy) = request_for_audit {
-            let resp_json = serde_json::to_value(&response).unwrap_or(serde_json::Value::Null);
-            audit::log_stored_completion(&request_id, &req_copy, resp_json);
-        }
+        if let Some(req_copy) = request_for_audit {
+            let resp_json = serde_json::to_value(&response).unwrap_or(serde_json::Value::Null);
+            audit::log_stored_completion(&request_id, &req_copy, resp_json);
+        }
Follow-up (optional, requires tracing-serde and logger support): log as nested objects using AsSerde and a target (see audit.rs comment).
lib/llm/src/http/service/audit.rs (4)
26-31: Avoid potential panic and normalize timestamp type

duration_since(UNIX_EPOCH).unwrap() can theoretically panic; also prefer i64 for downstream JSON consumers.

Apply:
-    let ts_ms = SystemTime::now()
-        .duration_since(UNIX_EPOCH)
-        .unwrap()
-        .as_millis();
+    let ts_ms: i64 = SystemTime::now()
+        .duration_since(UNIX_EPOCH)
+        .map(|d| d.as_millis() as i64)
+        .unwrap_or(0);
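
The suggested pattern can be exercised on its own. This is a standalone sketch: `unix_millis` is a hypothetical helper name, and taking the time as a parameter is an assumption made so the fallback path can be checked.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Milliseconds since the Unix epoch as i64, or 0 if `now` is before the
/// epoch (the case where `duration_since` returns an error).
fn unix_millis(now: SystemTime) -> i64 {
    now.duration_since(UNIX_EPOCH)
        .map(|d| d.as_millis() as i64)
        .unwrap_or(0)
}

fn main() {
    println!("now: {} ms", unix_millis(SystemTime::now()));
    // A clock set before 1970 falls back to 0 instead of panicking.
    let before_epoch = UNIX_EPOCH - Duration::from_secs(1);
    assert_eq!(unix_millis(before_epoch), 0);
}
```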
32-41: Add a dedicated tracing target and (optionally) emit structured JSON fields

A target simplifies log routing. Today %request_val/%response_json serialize to strings; if your JSONL stack expects nested objects, emit serde-backed values.

Minimal (target only):
-    tracing::info!(
+    tracing::info!(target: "dynamo_audit",
         log_type = "audit",
         schema_version = "1.0",
         ts_ms = ts_ms,
         store_id = %store_id,
         request_id = request_id,
         request = %request_val,
         response = %response_json,
         "Audit log for stored completion"
     );
Optional (structured fields; requires adding tracing-serde and configuring the JSON formatter to honor it):
-    tracing::info!(target: "dynamo_audit",
-        request = %request_val,
-        response = %response_json,
-        ...
-    );
+    use tracing_serde::AsSerde;
+    tracing::info!(target: "dynamo_audit",
+        request = ?AsSerde(&request_val),
+        response = ?AsSerde(&response_json),
+        ...
+    );
Please confirm what your DYN_LOGGING_JSONL layer expects.

49-64: Tests cover flag matrix; consider one negative-path clone test (optional)

You might add a test asserting no request clone occurs when stream=true or store=false (using counters or a lightweight wrapper), but this is optional.

81-113: Smoke test is fine; consider asserting log shape via a test subscriber (optional)

If feasible, attach a test tracing subscriber to capture the event and assert presence of log_type="audit" and request_id.

I can draft a minimal test subscriber if you’d like.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bc29b59 and ebcadce.

📒 Files selected for processing (3)

lib/llm/src/http/service.rs (1 hunks)
lib/llm/src/http/service/audit.rs (1 hunks)
lib/llm/src/http/service/openai.rs (3 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

lib/llm/src/http/service/audit.rs (1)

lib/llm/src/http/service/openai.rs (2)

chat_completions (455-605)

s (61-61)

lib/llm/src/http/service/openai.rs (1)

lib/llm/src/http/service/audit.rs (2)

should_audit_flags (15-17)

log_stored_completion (19-42)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Build and Test - dynamo
GitHub Check: pre-merge-rust (lib/bindings/python)
GitHub Check: pre-merge-rust (.)
GitHub Check: pre-merge-rust (lib/runtime/examples)

🔇 Additional comments (3)

lib/llm/src/http/service/openai.rs (1)

27-27: Import looks good

lib/llm/src/http/service/audit.rs (2)

9-13: Env flag parsing: LGTM

Covers "1"/"true" (case-insensitive) and defaults off.

15-17: Audit gating logic is correct

Non-streaming + enabled + store=true is enforced.
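
The rule stated in this note can be written as a small truth function. This is a sketch of the gate as of the revision reviewed here (the PR description lists streaming support for the final version); `should_audit` and its signature are hypothetical and are not claimed to match `should_audit_flags`.

```rust
/// Hypothetical gate: audit only when the feature is enabled, the request
/// opted in with store=true, and the request is non-streaming.
/// Unset `store` and `stream` fields are treated as false (an assumption).
fn should_audit(enabled: bool, store: Option<bool>, stream: Option<bool>) -> bool {
    enabled && store.unwrap_or(false) && !stream.unwrap_or(false)
}

fn main() {
    let cases = [
        (true, Some(true), None),       // enabled, opted in, non-streaming
        (true, Some(true), Some(true)), // streaming request
        (true, None, None),             // store not set
        (false, Some(true), None),      // feature disabled
    ];
    for (enabled, store, stream) in cases {
        println!(
            "enabled={} store={:?} stream={:?} -> {}",
            enabled,
            store,
            stream,
            should_audit(enabled, store, stream)
        );
    }
}
```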

ryanolson

Is http the right place for this, or is it the preprocessor/processor?

http is just a shim/transport. If we put the auditing here, we need to do the same in the grpc frontend that @GuanLuo is working on (or has finished).

The responses are finalized in the post part of the "processor".

That feels like a better place to audit, since all frontend code paths (regardless of public API or transport) will flow through there.

lib/llm/src/http/service/audit.rs

lib/llm/src/http/service/openai.rs

ryan-lempka · 2025-09-16T19:18:27Z

@ryanolson thanks for the feedback - makes sense. I’ll work through the comments later this afternoon and move the logic into the processor.

ryan-lempka · 2025-09-17T22:38:13Z

@ryanolson ready for re-review. Let me know what you think of using annotations in this manner. Also this PR is scoped to non-streaming for now but I want to make sure the direction will align with streaming as well.

lib/llm/src/audit/config.rs

lib/llm/src/audit/log.rs

copy-pr-bot · 2025-09-30T19:47:13Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

ayushag-nv · 2025-09-30T21:13:17Z

/ok to test 048d6ef

ryan-lempka requested a review from a team as a code owner September 16, 2025 17:56

pull-request-size bot added the size/L label Sep 16, 2025

github-actions bot added the feat label Sep 16, 2025

ryan-lempka self-assigned this Sep 16, 2025

ryan-lempka changed the title ~~feat: add audit logging for chat completions~~ feat: add audit logging for chat completions non-streaming Sep 16, 2025

ryan-lempka requested a review from grahamking September 16, 2025 18:00

coderabbitai bot reviewed Sep 16, 2025

ryan-lempka requested a review from ryanolson September 16, 2025 18:55

ryanolson requested changes Sep 16, 2025

lib/llm/src/http/service/audit.rs (outdated, resolved)

lib/llm/src/http/service/openai.rs (outdated, resolved)

lib/llm/src/http/service/openai.rs (outdated, resolved)

copy-pr-bot bot temporarily deployed to GITLAB September 17, 2025 00:59 Inactive

copy-pr-bot bot temporarily deployed to GITLAB September 17, 2025 01:00 Inactive

pull-request-size bot added size/M and removed size/L labels Sep 17, 2025

copy-pr-bot bot temporarily deployed to GITLAB September 17, 2025 22:13 Inactive

copy-pr-bot bot temporarily deployed to GITLAB September 17, 2025 22:14 Inactive

ryan-lempka force-pushed the rlempka/log-stdout-store-true branch from 4eabe60 to 6bc8a07 Compare September 17, 2025 22:17

copy-pr-bot bot temporarily deployed to GITLAB September 17, 2025 22:17 Inactive

copy-pr-bot bot temporarily deployed to GITLAB September 17, 2025 22:19 Inactive

ryan-lempka force-pushed the rlempka/log-stdout-store-true branch from 6bc8a07 to cdab20e Compare September 17, 2025 22:26

pull-request-size bot added size/L and removed size/M labels Sep 17, 2025

copy-pr-bot bot temporarily deployed to GITLAB September 17, 2025 22:26 Inactive

copy-pr-bot bot temporarily deployed to GITLAB September 17, 2025 22:30 Inactive

ryan-lempka requested a review from rmccorm4 September 17, 2025 22:34

copy-pr-bot bot temporarily deployed to GITLAB September 18, 2025 17:30 Inactive

copy-pr-bot bot temporarily deployed to GITLAB September 18, 2025 17:31 Inactive

copy-pr-bot bot temporarily deployed to GITLAB September 18, 2025 21:41 Inactive

ryanolson approved these changes Sep 30, 2025

lib/llm/src/audit/config.rs (resolved)

lib/llm/src/audit/log.rs (outdated, resolved)

ryan-lempka force-pushed the rlempka/log-stdout-store-true branch from 548e5d4 to 6bcc3f3 Compare September 30, 2025 19:47

ryan-lempka force-pushed the rlempka/log-stdout-store-true branch from 6bcc3f3 to 9cc0140 Compare September 30, 2025 19:49

ryan-lempka added 15 commits September 30, 2025 19:52 (each signed off by Ryan Lempka)

- a5241d3 feat: audit log to stdout store true
- 8923cb4 chore: clean up audit log schema
- b47e2ae feat: non-streaming req/resp logging
- 97034b2 feat: initial refactor to transport agnostic audit
- df3b140 feat: initial refactor with annotations instead of new hash map
- e0f69be chore: improve error handling for request serialization
- 917cc34 feat: initial second refactor to use bcast channel
- d2aab2b feat: third refactor to keep req/resp capture in preprocessor.rs
- 2b81c48 feat: improve non-streaming agg logic and move stream=true hardcode to generate function
- 416f034 fix: clippy
- 2d35bc9 chore: add rc to root toml serde
- 32897b6 chore: set stream to true in completions generate function
- 5646306 chore: sync Cargo.toml with main
- 6f36917 chore: remove different modes
- 048d6ef chore: remove unused variable

ryan-lempka force-pushed the rlempka/log-stdout-store-true branch from 9cc0140 to 048d6ef Compare September 30, 2025 19:55

ryan-lempka enabled auto-merge (squash) September 30, 2025 19:56

ryan-lempka disabled auto-merge September 30, 2025 20:20

copy-pr-bot bot temporarily deployed to GITLAB September 30, 2025 21:13 Inactive

copy-pr-bot bot temporarily deployed to GITLAB September 30, 2025 21:14 Inactive

ryan-lempka enabled auto-merge (squash) September 30, 2025 21:22

ryan-lempka merged commit 56d20f5 into main Sep 30, 2025
25 of 26 checks passed

ryan-lempka deleted the rlempka/log-stdout-store-true branch September 30, 2025 21:48

ziqifan617 pushed a commit that referenced this pull request Oct 1, 2025

feat: add audit logging for chat completions (#3062)

0852c32

Signed-off-by: Ryan Lempka <[email protected]>

nv-tusharma pushed a commit that referenced this pull request Oct 20, 2025

feat: add audit logging for chat completions (#3062)

deb9aa2

Signed-off-by: Ryan Lempka <[email protected]>


4 participants
