RPM Tiers
| Plan | Requests per Minute (RPM) |
|---|---|
| Beta | 60 |
| Pro | 250 |
| Enterprise | Custom |
Rate limits apply per API key, not per model. All requests across all models count toward your RPM limit.
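As a rough illustration of what these limits mean for pacing, the sketch below converts an RPM limit into the average gap between requests that stays within it. The helper name `min_interval_seconds` is made up for this example and is not part of the API. Because the limit is per API key, the gap applies to all requests sent with that key, whichever model they target.

```python
# Plan limits copied from the table above; Enterprise is custom, so it is omitted.
PLAN_RPM = {"Beta": 60, "Pro": 250}


def min_interval_seconds(rpm: int) -> float:
    """Average gap between requests, in seconds, that stays within `rpm`."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 60.0 / rpm


for plan, rpm in PLAN_RPM.items():
    # The budget is shared by every model called with the same API key.
    print(f"{plan}: about one request every {min_interval_seconds(rpm):.2f} s")
```

This is an average only. The source does not say how the limit window is measured, so evenly spaced requests are the conservative choice.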
Handling Rate Limits
When you exceed your RPM limit, the API returns a `429 Too Many Requests` response with a `Retry-After` header.
Example response headers:
Best Practices
- Read the `Retry-After` header: it tells you exactly how many seconds to wait before retrying.
- Implement exponential backoff: start with the `Retry-After` value, then double the wait time on consecutive failures.
- Queue requests: if you’re making many requests, use a queue with a rate limiter to stay under your RPM ceiling.
- Monitor usage: use `GET /v1/usage` or the Dashboard to track your consumption.
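The first two practices can be combined in a small retry helper. The following is a minimal sketch, not an official client. It assumes the caller supplies a `send` function that performs one request and returns a `(status_code, headers)` pair. The names `send_with_backoff` and `parse_retry_after`, the retry cap, the maximum wait, and the one-second fallback are all assumptions made for illustration.

```python
import time
from typing import Callable, Mapping, Optional, Tuple

# Hypothetical response shape for this sketch: (status_code, headers).
Response = Tuple[int, Mapping[str, str]]


def parse_retry_after(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Return Retry-After in seconds, or `default` if it is absent or not numeric."""
    for name, value in headers.items():
        if name.lower() == "retry-after":  # header names are case-insensitive
            try:
                return max(0.0, float(value))
            except ValueError:
                return default
    return default


def send_with_backoff(
    send: Callable[[], Response],
    max_retries: int = 5,      # assumed cap, not an API requirement
    max_wait: float = 60.0,    # assumed ceiling on a single wait, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` until it returns a non-429 status or retries run out."""
    wait: Optional[float] = None
    for attempt in range(max_retries + 1):
        status, headers = send()
        if status != 429 or attempt == max_retries:
            return status, headers
        # First 429: wait for Retry-After. Consecutive 429s: double the last wait.
        wait = parse_retry_after(headers) if wait is None else wait * 2
        sleep(min(wait, max_wait))
    raise ValueError("max_retries must be non-negative")
```

To use it, wrap a single request in a zero-argument function, for example `send_with_backoff(lambda: my_request())`, where `my_request` is your own function returning the status code and headers. Passing `sleep` as a parameter keeps the helper testable without real delays.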