| Model | Requests per minute | Tokens per minute |
|---|---|---|
| Llama-4-Maverick-17B-128E-Instruct-FP8 | 10 | 250,000 |
| Llama-4-Scout-17B-16E-Instruct-FP8 | 10 | 250,000 |
| Llama-3.3-70B-Instruct | 10 | 250,000 |
| Llama-3.3-8B-Instruct | 10 | 250,000 |
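
A requests-per-minute (RPM) limit implies a minimum average spacing between requests: at 10 RPM that is 60 / 10 = 6 seconds. The following is a minimal sketch of a client-side pacer built on that arithmetic. `RequestPacer` is a hypothetical helper, not part of any SDK, and it assumes evenly spaced requests suit your workload. It does not track token usage, so the tokens-per-minute limit still has to be handled separately.

```python
import time


class RequestPacer:
    """Hypothetical helper: spaces calls at least 60 / requests_per_minute seconds apart."""

    def __init__(self, requests_per_minute, clock=time.monotonic, sleep=time.sleep):
        # 10 requests per minute -> 6.0 seconds between request starts.
        self.min_interval = 60.0 / requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next request may start

    def wait(self):
        """Block until the next request may start; return the seconds waited."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return delay
```

Call `pacer.wait()` immediately before each API request. The `clock` and `sleep` parameters exist only so the pacer can be exercised without real delays.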
If you exceed a rate limit, the API responds with an HTTP 429 error and an error message indicating that too many requests have been made. This means that you have exceeded your team’s rate limit and cannot make more API requests until your usage falls below the limits for your team. The API stops returning these errors once the RPM or TPM rates are no longer being exceeded.

| Header | Description |
|---|---|
| x-ratelimit-limit-tokens | The total number of tokens for the token limit |
| x-ratelimit-remaining-tokens | The remaining number of tokens you can use before hitting the token limit |
| x-ratelimit-limit-requests | The total number of requests for the requests limit |
| x-ratelimit-remaining-requests | The remaining number of requests you can make before hitting the requests limit |
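
The headers above and the 429 behavior can be combined on the client side. The sketch below makes several assumptions: the header values are plain integers, the headers arrive on API responses, and `send` is a hypothetical callable you supply that performs one request and returns a `(status, headers, body)` tuple. Exponential backoff is a common retry strategy chosen here for illustration; it is not something this documentation prescribes.

```python
import time

# Header names as listed in the table above, mapped to shorter keys.
RATE_LIMIT_HEADERS = {
    "limit_tokens": "x-ratelimit-limit-tokens",
    "remaining_tokens": "x-ratelimit-remaining-tokens",
    "limit_requests": "x-ratelimit-limit-requests",
    "remaining_requests": "x-ratelimit-remaining-requests",
}


def parse_rate_limit_headers(headers):
    """Return the rate limit values as ints, or None when absent or unparseable."""
    lowered = {name.lower(): value for name, value in headers.items()}
    parsed = {}
    for key, header_name in RATE_LIMIT_HEADERS.items():
        try:
            # Assumption: values are plain integers such as "250000".
            parsed[key] = int(lowered[header_name])
        except (KeyError, ValueError, TypeError):
            parsed[key] = None
    return parsed


def call_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    `send` is a hypothetical zero-argument callable returning
    (status, headers, body). Delays double after each 429 response.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            break
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    # Returns the last response, which is still a 429 if every attempt failed.
    return status, headers, body
```

Checking `remaining_requests` and `remaining_tokens` after each successful call lets a client slow down before it reaches a limit, rather than reacting only after a 429.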