Rate Limits

Lightweight enforces per-key rate limits measured in requests per minute (RPM). Your limit depends on your plan tier.

RPM Tiers

Rate limits apply per API key, not per model. All requests across all models count toward your RPM limit.

When you exceed your RPM limit, the API returns a 429 Too Many Requests response with a Retry-After header.

Example response headers:

HTTP/1.1 429 Too Many Requests
Retry-After: 12
Content-Type: application/json

Example response body:

{
  "error": {
    "message": "Rate limit exceeded. Please retry after 12 seconds.",
    "type": "rate_limit_error",
    "code": 429
  }
}
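The response above can be read programmatically. The following is a minimal sketch, not an official client: the helper names `retry_delay_seconds` and `error_message` are hypothetical, and the 1-second fallback used when Retry-After is missing or not a plain number is an assumption, not documented behavior.

```python
import json


def retry_delay_seconds(status, headers, default=1.0):
    """Return seconds to wait if the response is a 429, else None."""
    if status != 429:
        return None
    # HTTP header names are case-insensitive; normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    raw = normalized.get("retry-after")
    try:
        return float(raw)
    except (TypeError, ValueError):
        # Header missing or not a number of seconds: use the assumed fallback.
        return default


def error_message(body):
    """Extract error.message from the JSON body, or None if it is absent."""
    try:
        return json.loads(body)["error"]["message"]
    except (ValueError, KeyError, TypeError):
        return None
```

With the example headers shown earlier, `retry_delay_seconds(429, {"Retry-After": "12"})` returns `12.0`.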

To recover cleanly from 429 responses and stay within your limit:

Read the Retry-After header: it tells you how many seconds to wait before retrying.
Implement exponential backoff: start with the Retry-After value, then double the wait time on each consecutive failure.
Queue requests: if you’re making many requests, use a queue with a rate limiter to stay under your RPM ceiling.
Monitor usage: use GET /v1/usage or the Dashboard to track your consumption.
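The backoff and rate-limiter recommendations above might be sketched as follows. This is an illustration under stated assumptions, not a prescribed implementation: `backoff_delays` and `SlidingWindowLimiter` are hypothetical names, the 60-second cap and five-attempt limit are arbitrary choices, and a sliding 60-second window is only one way to approximate an RPM ceiling on the client side.

```python
import time
from collections import deque


def backoff_delays(retry_after, max_attempts=5, cap=60.0):
    """Yield wait times: the Retry-After value first, doubling after each failure."""
    delay = float(retry_after)
    for _ in range(max_attempts):
        # The cap is an assumption; it keeps waits from growing without bound.
        yield min(delay, cap)
        delay *= 2


class SlidingWindowLimiter:
    """Allow at most `rpm` requests in any 60-second window (client-side)."""

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        self.rpm = rpm
        self.clock = clock
        self.sleep = sleep
        self.sent = deque()  # timestamps of requests inside the current window

    def acquire(self):
        """Block until a request slot is free, then record the request."""
        while True:
            now = self.clock()
            # Drop timestamps that have aged out of the 60-second window.
            while self.sent and now - self.sent[0] >= 60.0:
                self.sent.popleft()
            if len(self.sent) < self.rpm:
                self.sent.append(now)
                return
            # Sleep until the oldest recorded request leaves the window.
            self.sleep(60.0 - (now - self.sent[0]))
```

A caller would invoke `acquire()` before each request and, on a 429, sleep for each value from `backoff_delays(...)` in turn until a retry succeeds. Because limits apply per API key across all models, one limiter instance per key is the natural unit.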

Need higher limits? Contact us for Enterprise plans with custom RPM quotas.