The API has two types of limits:
  1. Rate limits — requests per minute, per endpoint
  2. Request quotas — total requests per billing period

Rate limits

Rate limits prevent abuse and ensure fair usage. They use a rolling (sliding) 60-second window.

Per-endpoint limits

Endpoint                Limit          Window
POST /api/v1/generate   500 requests   1 minute
POST /api/v1/preview    300 requests   1 minute
GET /api/v1/fonts       100 requests   1 minute
GET /api/v1/status does not require authentication and is not rate-limited per API key.

How rate limits work

  • Limits are tracked per API key, per endpoint
  • Each endpoint has its own independent counter
  • The window is rolling (not aligned to the clock minute)
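To make the tracking rules above concrete, here is a minimal sketch of a rolling-window counter keyed by API key and endpoint. It is an illustrative model only, not the service's actual implementation; the class name SlidingWindowCounter and its allow method are hypothetical, and the limits are copied from the per-endpoint table.

```python
import time
from collections import defaultdict, deque

# Limits from the per-endpoint table (requests per rolling 60-second window).
LIMITS = {
    "POST /api/v1/generate": 500,
    "POST /api/v1/preview": 300,
    "GET /api/v1/fonts": 100,
}
WINDOW_SECONDS = 60.0


class SlidingWindowCounter:
    """Illustrative model: one independent rolling window per (API key, endpoint)."""

    def __init__(self, limits=LIMITS, window=WINDOW_SECONDS, clock=time.monotonic):
        self.limits = limits
        self.window = window
        self.clock = clock  # injectable so the model can be tested without sleeping
        self.hits = defaultdict(deque)  # (api_key, endpoint) -> request timestamps

    def allow(self, api_key, endpoint):
        """Record a request and return True, or return False if the window is full."""
        now = self.clock()
        timestamps = self.hits[(api_key, endpoint)]
        # Drop requests that have aged out of the rolling window.
        while timestamps and now - timestamps[0] >= self.window:
            timestamps.popleft()
        if len(timestamps) >= self.limits[endpoint]:
            return False
        timestamps.append(now)
        return True
```

Because each (key, endpoint) pair has its own deque, exhausting the /fonts window in this model does not affect /preview, and capacity returns gradually as old requests age out rather than all at once at the top of the minute.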

When you hit a rate limit

The API returns 429 Too Many Requests:
{
  "error": "RATE_LIMIT_EXCEEDED",
  "message": "Rate limit exceeded for /generate: 500 requests per 1 minute(s)",
  "limit": 500,
  "endpoint": "/generate",
  "retry_after": 45
}
The response also includes:
  • Retry-After header with seconds to wait
  • X-RateLimit-Reset header with reset timestamp
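A client might work out how long to pause from that response along these lines. This is a sketch under assumptions: the helper name seconds_to_wait is hypothetical, headers and body are plain dicts already extracted from the response, and the 60-second fallback is an arbitrary choice rather than documented behavior.

```python
def seconds_to_wait(headers, body, default=60):
    """Return how long to pause after a rate-limit 429.

    Prefers the Retry-After header, then the retry_after field in the JSON
    body, then a caller-chosen default (60 here is an assumption).
    """
    for value in (headers.get("Retry-After"), body.get("retry_after")):
        try:
            return max(0, int(value))
        except (TypeError, ValueError):
            continue  # missing or non-numeric; try the next source
    return default
```

For the example response above, seconds_to_wait({"Retry-After": "45"}, {"retry_after": 45}) returns 45. Note that a plain dict lookup is case-sensitive, so an HTTP library's case-insensitive header mapping is the safer input.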

Request quotas

Request quotas are monthly limits on billable requests.

What counts toward quota

Only successful /generate requests count:
Action                     Counts?
POST /generate (success)   Yes
POST /generate (error)     No
POST /preview              No
GET /fonts                 No
GET /status                No

When you hit quota

The API returns 429 with a different error code:
{
  "error": "REQUEST_LIMIT_EXCEEDED",
  "message": "Request limit exceeded: 10000/10000 monthly requests used",
  "limit": 10000,
  "used": 10000,
  "period": "monthly",
  "reset_at": "2024-02-01T00:00:00Z"
}
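Since retrying will not help until the quota resets, a client may want to compute how far away reset_at is. The following is a minimal sketch; the function name seconds_until_reset is hypothetical, and it assumes reset_at is always a UTC ISO 8601 timestamp in the form shown in the example.

```python
from datetime import datetime, timezone


def seconds_until_reset(reset_at, now=None):
    """Return the seconds remaining until the quota resets (0.0 if already past)."""
    # Swap the trailing "Z" for an explicit offset so fromisoformat()
    # accepts the value on Python versions before 3.11.
    reset = datetime.fromisoformat(reset_at.replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return max(0.0, (reset - now).total_seconds())
```

The optional now argument exists so the calculation can be checked against a fixed time; in normal use it can be omitted.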

Checking your quota

View your current usage in the dashboard under Usage.

Order of checks

Limits are checked in this order:

For POST /api/v1/generate

  1. Authentication — Is the API key valid?
  2. Request quota — Are you within monthly quota?
  3. Rate limit — Are you within per-minute limits?
  4. Request processing — Generate the output

For other authenticated endpoints

  1. Authentication
  2. Rate limit
  3. Request processing
Rate limiting and quota exhaustion both return 429; check the error field (RATE_LIMIT_EXCEEDED or REQUEST_LIMIT_EXCEEDED) to tell them apart.
On /generate, quota is checked before rate limiting. A quota 429 may not include Retry-After or X-RateLimit-* headers.
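One way to branch on the two kinds of 429 is sketched below. The function name classify_429 and the returned labels are hypothetical; only the two error codes come from the responses documented above.

```python
def classify_429(body):
    """Map the JSON body of a 429 response to a short label."""
    error = body.get("error")
    if error == "RATE_LIMIT_EXCEEDED":
        return "rate_limit"  # temporary: wait, then retry
    if error == "REQUEST_LIMIT_EXCEEDED":
        return "quota"  # retrying will not succeed before reset_at
    return "unknown"
```

A caller would typically retry only on "rate_limit" and surface "quota" to the user or an alerting system.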

Best practices

Avoid rate limits

  • Use /preview for testing (free, watermarked)
  • Batch requests when possible
  • Implement client-side request queuing
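As one possible form of client-side queuing, the sketch below spaces requests evenly so that no more than a given number start per window. The Pacer class is hypothetical and single-threaded; a real client would likely leave some headroom below the documented limit and add locking if requests are issued from several threads.

```python
import time


class Pacer:
    """Spaces calls so that at most `limit` of them start per `window` seconds."""

    def __init__(self, limit, window=60.0, clock=time.monotonic, sleep=time.sleep):
        self.interval = window / limit  # e.g. 60 / 500 = 0.12 s between requests
        self.clock = clock
        self.sleep = sleep
        self.next_slot = float("-inf")  # the first call never waits

    def wait(self):
        """Block until the next request slot is available."""
        now = self.clock()
        if now < self.next_slot:
            self.sleep(self.next_slot - now)
            now = self.clock()  # re-read in case the sleep overshot
        self.next_slot = now + self.interval
```

Calling pacer.wait() immediately before each request to an endpoint, with one Pacer per endpoint, keeps the request rate at or below the chosen limit without relying on 429 responses.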

Handle 429 responses

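import time

def generate_with_backoff(request_data, max_retries=3):
    """Call /generate, waiting out rate-limit 429s up to max_retries times."""
    # `api` is your API client object; generate() is assumed to return a
    # response exposing status_code, headers, and json().
    for attempt in range(max_retries):
        response = api.generate(request_data)

        if response.status_code != 429:
            return response

        # A quota 429 will not clear until the quota resets, so do not retry it.
        if response.json().get('error') == 'REQUEST_LIMIT_EXCEEDED':
            return response

        # Rate-limit 429: wait as instructed (60 s if the header is missing).
        retry_after = int(response.headers.get('Retry-After', 60))
        time.sleep(retry_after)

    raise RuntimeError("Rate limit exceeded after retries")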

Monitor usage

  • Check the X-RateLimit-Remaining header on responses
  • Monitor usage in the dashboard before the end of the month
  • Set up alerts for high usage
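A simple header check could look like the sketch below. It assumes X-RateLimit-Remaining carries a plain integer, which this page does not state explicitly; the function name remaining_is_low and the threshold of 10 are hypothetical.

```python
def remaining_is_low(headers, threshold=10):
    """Return True when X-RateLimit-Remaining is present and at or below threshold."""
    raw = headers.get("X-RateLimit-Remaining")
    if raw is None:
        return False  # header absent, e.g. on a quota 429
    try:
        return int(raw) <= threshold
    except ValueError:
        return False  # unexpected format; treat as unknown rather than low
```

A client could slow down or log a warning whenever this returns True, before a 429 is ever returned.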