Lightweight enforces per-key rate limits measured in requests per minute (RPM). Your limit depends on your plan tier.

RPM Tiers

Plan         Requests per Minute (RPM)
Beta         60
Pro          250
Enterprise   Custom
Rate limits apply per API key, not per model. All requests across all models count toward your RPM limit.
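
Because the limit is counted per key rather than per model, a single client-side limiter shared by every call made with that key is usually enough. The following is a minimal sketch of a sliding-window limiter, not an official client: the class name RpmLimiter and its parameters are hypothetical, and the 60-second window is an assumption about how the server counts a "minute".

```python
import time
from collections import deque


class RpmLimiter:
    """Client-side sliding-window limiter: at most `rpm` calls per 60 seconds.

    Hypothetical helper; the server's exact accounting window is an assumption.
    """

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        self.rpm = rpm
        self.clock = clock
        self.sleep = sleep
        self.sent = deque()  # timestamps of requests sent in the last 60 seconds

    def acquire(self):
        """Block until one more request fits in the window, then record it."""
        while True:
            now = self.clock()
            # Drop timestamps that have aged out of the 60-second window.
            while self.sent and now - self.sent[0] >= 60.0:
                self.sent.popleft()
            if len(self.sent) < self.rpm:
                self.sent.append(now)
                return
            # Window is full: wait until the oldest request leaves it.
            self.sleep(60.0 - (now - self.sent[0]))
```

Create one limiter per API key (for example RpmLimiter(60) on the Beta tier) and call acquire() before every request, whichever model the request targets.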

Handling Rate Limits

When you exceed your RPM limit, the API returns a 429 Too Many Requests response with a Retry-After header.
Example response headers:
HTTP/1.1 429 Too Many Requests
Retry-After: 12
Content-Type: application/json
Example response body:
{
  "error": {
    "message": "Rate limit exceeded. Please retry after 12 seconds.",
    "type": "rate_limit_error",
    "code": 429
  }
}
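
As an illustration, the response above could be interpreted on the client roughly as follows. This is a sketch under assumptions: the function name parse_rate_limit is hypothetical, it takes the status, headers, and body as plain values rather than tying itself to a particular HTTP library, and it only handles the seconds form of Retry-After shown in the example.

```python
import json


def parse_rate_limit(status, headers, body_text):
    """Return (retry_after_seconds, message) for a 429 response, else None.

    Either element of the tuple is None when it cannot be read.
    """
    if status != 429:
        return None
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        delay = float(lowered["retry-after"])
    except (KeyError, ValueError):
        delay = None  # header missing or not a plain number of seconds
    try:
        message = json.loads(body_text)["error"]["message"]
    except (ValueError, KeyError, TypeError):
        message = None  # body missing, not JSON, or not the expected shape
    return delay, message
```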

Best Practices

  1. Read the Retry-After header: it tells you exactly how many seconds to wait before retrying.
  2. Implement exponential backoff: start with the Retry-After value, then double the wait time on each consecutive failure.
  3. Queue requests: if you’re making many requests, use a queue with a rate limiter to stay under your RPM ceiling.
  4. Monitor usage: use GET /v1/usage or the Dashboard to track your consumption.
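
The first two practices can be combined in a small retry loop. The sketch below is one possible shape, not a prescribed implementation: call_with_backoff and the send callable are hypothetical, send is assumed to return a (status, retry_after_seconds, result) tuple, and the 1-second fallback, the 60-second cap, and the rule of never waiting less than a fresh Retry-After value are assumptions added for safety.

```python
import time


def call_with_backoff(send, max_attempts=5, max_wait=60.0, sleep=time.sleep):
    """Call `send()` until it is not rate limited or attempts run out.

    `send` is a hypothetical callable returning (status, retry_after, result),
    where retry_after is the Retry-After value in seconds, or None if absent.
    """
    wait = None
    for attempt in range(max_attempts):
        status, retry_after, result = send()
        if status != 429:
            return result
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        if wait is None:
            # First failure: start from the server's hint (assumed 1s if absent).
            wait = retry_after if retry_after is not None else 1.0
        else:
            # Consecutive failure: double, but never go below the new hint.
            wait = max(wait * 2, retry_after or 0.0)
        wait = min(wait, max_wait)  # assumed upper bound on a single wait
        sleep(wait)
    raise RuntimeError("still rate limited after %d attempts" % max_attempts)
```

Passing sleep as a parameter keeps the loop easy to test without real delays; in production code the default time.sleep is used.
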
Need higher limits? Contact us for Enterprise plans with custom RPM quotas.