Description
Motivation
I want to use a certain model for my task, but I am hitting its rate limit. I found the doc page Set LLM Rate Limits, but the only fixes it offers are purchasing an upgraded plan or switching to OpenRouter (which is really cool, by the way).
Desired Solution
I would like to be able to simply run my task more slowly instead of having to switch models because I am hitting a rate limit. The model I am hitting the limit on is GPT-5, and the API specifically tells me that I have a limit of 30,000 tokens per minute (TPM). It doesn't seem unreasonable to have a TPM setting within Goose, so one could explicitly set the rate limit to match.
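To illustrate the idea, here is a minimal sketch of how such a TPM setting could work internally. This is not Goose code; the `TpmLimiter` class and its methods are hypothetical. It tracks token spend in a sliding 60-second window and sleeps before any request that would push usage past the configured budget:

```python
import time
from collections import deque


class TpmLimiter:
    """Hypothetical sketch of a tokens-per-minute throttle.

    Records (timestamp, tokens) events in a sliding window and blocks
    a request until it fits within the configured TPM budget.
    """

    def __init__(self, tokens_per_minute: int, window: float = 60.0):
        self.limit = tokens_per_minute
        self.window = window
        self.events: deque[tuple[float, int]] = deque()

    def _used(self, now: float) -> int:
        # Drop events older than the window, then sum what remains.
        while self.events and now - self.events[0][0] >= self.window:
            self.events.popleft()
        return sum(tokens for _, tokens in self.events)

    def acquire(self, tokens: int) -> None:
        """Block until `tokens` can be spent without exceeding the limit."""
        if tokens > self.limit:
            raise ValueError("single request exceeds the per-minute budget")
        while True:
            now = time.monotonic()
            if self._used(now) + tokens <= self.limit:
                self.events.append((now, tokens))
                return
            # Sleep until the oldest event ages out of the window.
            wait = self.window - (now - self.events[0][0])
            time.sleep(max(wait, 0.01))


# Example: cap usage at the 30,000 TPM limit the API reports.
limiter = TpmLimiter(tokens_per_minute=30000)
limiter.acquire(1200)  # returns immediately while under budget
```

A provider wrapper would call `acquire()` with the estimated token count before each request, so the task slows down instead of failing with a 429.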
Alternatives considered
I haven't really considered any, because this seems like a straightforward and reasonable solution.
- I have verified this does not duplicate an existing feature request