@spoons-and-mirrors spoons-and-mirrors commented Aug 19, 2025

Summary

Rate limit handling is currently missing. This PR introduces rate limiting per provider/model pair through the config file by adding an rpm (requests per minute) field to limit.

The implementation is simple: a request that would exceed the limit is put to sleep until it is allowed to go out. This keeps you in the "message flow", so you don't have to wait and re-prompt the model after you've hit the limit.

"provider": {
  "google": {
    "models": {
      "gemini-2.5-pro": {
        "limit": {
          "rpm": 10
        }
      }
    }
  }
}
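
For illustration only, here is a minimal sketch of the "sleep instead of fail" idea as a sliding-window limiter. It is not the code in this PR: RpmLimiter, reserve, acquire, limiterFor and formatEta are hypothetical names, and the 60-second sliding window is an assumption about how rpm might be interpreted.

```typescript
// Hypothetical sketch, not the PR's implementation.
// Assumption: "rpm" means at most `rpm` request starts in any 60-second window.
const WINDOW_MS = 60_000;

class RpmLimiter {
  // Scheduled start times (ms) of recent requests, oldest first.
  private starts: number[] = [];
  private rpm: number;
  private now: () => number;

  constructor(rpm: number, now: () => number = () => Date.now()) {
    this.rpm = rpm;
    this.now = now; // injectable clock, which makes the limiter easy to test
  }

  // Milliseconds the next request would have to wait (0 when under the limit).
  delayMs(t: number = this.now()): number {
    // Forget requests that have already left the window.
    while (this.starts.length > 0 && t - this.starts[0] >= WINDOW_MS) {
      this.starts.shift();
    }
    if (this.starts.length < this.rpm) return 0;
    // A slot opens once the rpm-th most recent start leaves the window.
    const blocker = this.starts[this.starts.length - this.rpm];
    return Math.max(0, blocker + WINDOW_MS - t);
  }

  // Reserve the next slot and return how long the caller must sleep.
  // Reserving before sleeping keeps concurrent callers from sharing a slot.
  reserve(): number {
    const t = this.now();
    const wait = this.delayMs(t);
    this.starts.push(t + wait);
    return wait;
  }

  // Sleep until the reserved slot instead of failing the request.
  async acquire(): Promise<void> {
    const wait = this.reserve();
    if (wait > 0) {
      await new Promise<void>((resolve) => setTimeout(resolve, wait));
    }
  }
}

// One limiter per provider/model pair, created on first use.
const limiters = new Map<string, RpmLimiter>();

function limiterFor(provider: string, model: string, rpm: number): RpmLimiter {
  const key = `${provider}/${model}`;
  let limiter = limiters.get(key);
  if (!limiter) {
    limiter = new RpmLimiter(rpm);
    limiters.set(key, limiter);
  }
  return limiter;
}

// Wait time rounded up to whole seconds, e.g. for a status line.
function formatEta(ms: number): string {
  return `${Math.ceil(ms / 1000)}s`;
}
```

A caller would await limiterFor("google", "gemini-2.5-pro", 10).acquire() before sending a request, and delayMs() gives the remaining wait that a UI could display.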

The status bar has also been updated to show the ETA of the next request while it is being rate limited.


Notes

I'm unsure whether rpm should be nested under a rate object or stay directly under limit. @thdxr?
