Description
Motivation
I want to use a certain model for my task, but I am hitting its rate limit. I found the doc page Set LLM Rate Limits, but the only fixes it offers are purchasing an upgraded plan or switching to OpenRouter (which is really cool, by the way).
Desired Solution
I would like to be able to simply run my task more slowly instead of having to switch models because I am hitting a rate limit. The model I am hitting the limit on is GPT-5, and the API specifically tells me that I have a limit of 30,000 tokens per minute (TPM). It doesn't seem unreasonable to have a TPM setting within Goose, so one could explicitly set the rate limit to match.
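To illustrate the idea, here is a minimal sketch of how such a TPM setting could work internally. This is not Goose code; the `TpmLimiter` class and its methods are hypothetical. It tracks token spend in a sliding 60-second window and sleeps before any request that would push usage past the configured budget:

```python
import time
from collections import deque


class TpmLimiter:
    """Hypothetical sketch of a tokens-per-minute throttle.

    Records (timestamp, tokens) events in a sliding window and blocks
    a request until it fits within the configured TPM budget.
    """

    def __init__(self, tokens_per_minute: int, window: float = 60.0):
        self.limit = tokens_per_minute
        self.window = window
        self.events: deque[tuple[float, int]] = deque()

    def _used(self, now: float) -> int:
        # Drop events older than the window, then sum what remains.
        while self.events and now - self.events[0][0] >= self.window:
            self.events.popleft()
        return sum(tokens for _, tokens in self.events)

    def acquire(self, tokens: int) -> None:
        """Block until `tokens` can be spent without exceeding the limit."""
        if tokens > self.limit:
            raise ValueError("single request exceeds the per-minute budget")
        while True:
            now = time.monotonic()
            if self._used(now) + tokens <= self.limit:
                self.events.append((now, tokens))
                return
            # Sleep until the oldest event ages out of the window.
            wait = self.window - (now - self.events[0][0])
            time.sleep(max(wait, 0.01))


# Example: cap usage at the 30,000 TPM limit the API reports.
limiter = TpmLimiter(tokens_per_minute=30000)
limiter.acquire(1200)  # returns immediately while under budget
```

A provider wrapper would call `acquire()` with the estimated token count before each request, so the task slows down instead of failing with a 429.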
Alternatives considered
I haven't really considered any, because this seems like a straightforward and reasonable solution.
- I have verified this does not duplicate an existing feature request