rgerganov
Collaborator

When serving the OpenAI-compatible API, we should check whether {"stream_options": {"include_usage": true}} is set in the request when deciding whether to send usage statistics.

closes: #16048
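
The check described above can be sketched as follows. This is a minimal illustration in Python, not the server's actual implementation; the function name `should_send_usage` and the handling of malformed values are assumptions made for the example.

```python
from collections.abc import Mapping
from typing import Any


def should_send_usage(request: Mapping[str, Any]) -> bool:
    """Return True only if the client opted in to usage statistics.

    Hypothetical helper: usage is sent only when the request body
    contains {"stream_options": {"include_usage": true}}.
    """
    stream_options = request.get("stream_options")
    # A missing, null, or non-object stream_options means "no opt-in".
    if not isinstance(stream_options, Mapping):
        return False
    # Require an explicit JSON true; absent or other values do not opt in.
    return stream_options.get("include_usage") is True
```

With this sketch, a request body such as `{"stream": true}` would not receive usage statistics, while one that adds `"stream_options": {"include_usage": true}` would.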
@github-actions github-actions bot added the python python script changes label Sep 17, 2025
@ngxson
Collaborator

ngxson commented Sep 17, 2025

Not sure if this change may affect the new web ui, could you check @allozaur ?

And btw I think this should be mentioned in server's changelog as it's technically a breaking change.

@allozaur
Collaborator

> Not sure if this change may affect the new web ui, could you check @allozaur ?
>
> And btw I think this should be mentioned in server's changelog as it's technically a breaking change.

I will check it tomorrow

@rgerganov
Collaborator Author

> And btw I think this should be mentioned in server's changelog as it's technically a breaking change.

Sure. It seems we missed that for PR #15444 which introduced this new way of sending usage statistics.

@allozaur
Collaborator

@ngxson @rgerganov

I tested by just running:

build/bin/llama-server -hf ggml-org/gpt-oss-20b-GGUF --jinja -c 0

I didn't see anything out of the ordinary when testing the web UI on this branch.
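
For a check that does not rely on the web UI, one could inspect the streamed response directly. The sketch below assumes the OpenAI-style server-sent-events format, where each event is a `data: <json>` line and the stream ends with `data: [DONE]`; the function name `stream_has_usage` is hypothetical.

```python
import json
from typing import Iterable


def stream_has_usage(sse_lines: Iterable[str]) -> bool:
    """Return True if any streamed chunk carries a non-empty "usage" object."""
    for line in sse_lines:
        line = line.strip()
        # Skip blank separators and anything that is not a data event.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        # Chunks without usage either omit the field or set it to null.
        if chunk.get("usage"):
            return True
    return False
```

Under the behavior this PR describes, the function would be expected to return `True` for a stream requested with `include_usage` set and `False` otherwise.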

Collaborator

@ngxson ngxson left a comment

Looks good then, thanks for testing. I added an entry to #9291

@rgerganov rgerganov merged commit 2b6b55a into ggml-org:master Sep 18, 2025
49 of 50 checks passed
Labels
examples python python script changes server
Development

Successfully merging this pull request may close these issues.

Misc. bug: server is always sending usage statistic
4 participants