The API has two types of limits:
- Rate limits — requests per minute, per endpoint
- Request quotas — total requests per billing period
Rate limits
Rate limits prevent abuse and ensure fair usage. They use a rolling (sliding) 60-second window.
Per-endpoint limits
| Endpoint | Limit | Window |
|---|---|---|
| POST /api/v1/generate | 500 requests | 1 minute |
| POST /api/v1/preview | 300 requests | 1 minute |
| GET /api/v1/fonts | 100 requests | 1 minute |
GET /api/v1/status does not require authentication and is not rate-limited per API key.
How rate limits work
- Limits are tracked per API key, per endpoint
- Each endpoint has its own independent counter
- The window is rolling (not aligned to the clock minute)
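The rules above can be modeled with a short sketch. This is an illustrative model only, not the service's actual implementation; the `SlidingWindowCounter` class and its method names are hypothetical. It keeps one queue of timestamps per (API key, endpoint) pair and counts only requests made within the last 60 seconds.

```python
from collections import defaultdict, deque


class SlidingWindowCounter:
    """Illustrative model of a rolling-window limit (hypothetical, not the real server code)."""

    def __init__(self, window_seconds=60.0):
        self.window_seconds = window_seconds
        # One independent queue of request timestamps per (api_key, endpoint).
        self._hits = defaultdict(deque)

    def allow(self, api_key, endpoint, limit, now):
        """Record a request at time `now` (seconds) and return True if it is within the limit."""
        hits = self._hits[(api_key, endpoint)]
        # Drop timestamps that have slid out of the rolling window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= limit:
            return False
        hits.append(now)
        return True
```

Because the window is rolling, a request made at second 0 stops counting at second 60, regardless of where the clock minute falls.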
When you hit a rate limit
The API returns 429 Too Many Requests:
```json
{
  "error": "RATE_LIMIT_EXCEEDED",
  "message": "Rate limit exceeded for /generate: 500 requests per 1 minute(s)",
  "limit": 500,
  "endpoint": "/generate",
  "retry_after": 45
}
```
The response includes:
- Retry-After header with the number of seconds to wait
- X-RateLimit-Reset header with the reset timestamp
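A client can derive its wait time from this response. The sketch below assumes the Retry-After header holds a whole number of seconds, as described above, and falls back to the retry_after field in the body. The function name `seconds_to_wait` and the 60-second default are illustrative choices, not part of the API.

```python
def seconds_to_wait(headers, body, default=60):
    """Return how long to pause after a rate-limit 429.

    `headers` and `body` are plain dicts. Header lookup here is
    case-sensitive; real HTTP clients usually normalize header names.
    """
    for value in (headers.get("Retry-After"), body.get("retry_after")):
        try:
            seconds = int(value)
        except (TypeError, ValueError):
            continue  # missing or non-numeric, try the next source
        if seconds >= 0:
            return seconds
    return default
```

For the example response above, `seconds_to_wait({}, {"retry_after": 45})` yields 45.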
Request quotas
Request quotas are monthly limits on billable requests.
What counts toward quota
Only successful /generate requests count:
| Action | Counts? |
|---|---|
| POST /generate (success) | Yes |
| POST /generate (error) | No |
| POST /preview | No |
| GET /fonts | No |
| GET /status | No |
When you hit quota
The API returns 429 with a different error code:
```json
{
  "error": "REQUEST_LIMIT_EXCEEDED",
  "message": "Request limit exceeded: 10000/10000 monthly requests used",
  "limit": 10000,
  "used": 10000,
  "period": "monthly",
  "reset_at": "2024-02-01T00:00:00Z"
}
```
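The reset_at value can be turned into a wait duration on the client. This is a minimal sketch, assuming reset_at is always an ISO 8601 UTC timestamp like the one in the example; the helper name `seconds_until_reset` is hypothetical.

```python
from datetime import datetime, timezone


def seconds_until_reset(reset_at, now=None):
    """Seconds from `now` until the quota resets (0 if already past)."""
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11,
    # so rewrite it as an explicit UTC offset first.
    reset = datetime.fromisoformat(reset_at.replace("Z", "+00:00"))
    if now is None:
        now = datetime.now(timezone.utc)
    return max(0.0, (reset - now).total_seconds())
```

Pass an explicit timezone-aware `now` when testing so the result does not depend on the current time.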
Checking your quota
View your current usage in the dashboard under Usage.
Order of checks
Limits are checked in this order:
For POST /api/v1/generate
1. Authentication: is the API key valid?
2. Request quota: are you within your monthly quota?
3. Rate limit: are you within the per-minute limit?
4. Request processing: generate the output
For other authenticated endpoints
1. Authentication
2. Rate limit
3. Request processing
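Rate-limit and quota responses both use status 429 but carry different error codes (RATE_LIMIT_EXCEEDED and REQUEST_LIMIT_EXCEEDED). Check the error field to tell them apart.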
On /generate, quota is checked before rate limiting. A quota 429 may not include Retry-After or X-RateLimit-* headers.
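Since both cases share a status code, a client may want to branch on the error field. The following is a sketch that assumes the body has already been parsed into a dict; the function name `classify_429` and its return labels are hypothetical.

```python
def classify_429(body):
    """Map a parsed 429 response body to 'rate_limit', 'quota', or 'unknown'."""
    error = body.get("error")
    if error == "RATE_LIMIT_EXCEEDED":
        return "rate_limit"  # transient: wait, then retry
    if error == "REQUEST_LIMIT_EXCEEDED":
        return "quota"  # retrying is unlikely to help before reset_at
    return "unknown"
```

Treating the two cases differently avoids sleeping and retrying on a quota error that will not clear until the billing period resets.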
Best practices
Avoid rate limits
- Use /preview for testing (free, watermarked)
- Batch requests when possible
- Implement client-side request queuing
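One simple form of client-side queuing is to space requests evenly so the per-minute limit is never reached. The sketch below is one possible approach, not an official client feature. `RequestPacer` is a hypothetical helper, and the clock and sleep functions are injectable so it can be tested without real delays.

```python
import time


class RequestPacer:
    """Spaces calls so that at most `limit` start per `window` seconds (hypothetical helper)."""

    def __init__(self, limit, window=60.0, clock=time.monotonic, sleep=time.sleep):
        self.interval = window / limit  # minimum gap between requests
        self._clock = clock
        self._sleep = sleep
        self._next_slot = None

    def wait(self):
        """Block until the next request may be sent; return the delay applied."""
        now = self._clock()
        if self._next_slot is None or now >= self._next_slot:
            # Idle long enough: send immediately and reserve the next slot.
            delay = 0.0
            self._next_slot = now + self.interval
        else:
            # Too soon: sleep until the reserved slot, then reserve the one after.
            delay = self._next_slot - now
            self._sleep(delay)
            self._next_slot += self.interval
        return delay
```

For /generate at 500 requests per minute, `RequestPacer(limit=500)` spaces calls 0.12 seconds apart. Call `wait()` before each request. Even spacing gives up burst capacity in exchange for simplicity.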
Handle 429 responses
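```python
import time

def generate_with_backoff(request_data, max_retries=3):
    # `api` is assumed to be a client object defined elsewhere whose
    # responses expose status_code, headers, and json().
    for attempt in range(max_retries):
        response = api.generate(request_data)
        if response.status_code != 429:
            return response
        # A quota 429 will not clear until the period resets, so don't retry it.
        if response.json().get("error") == "REQUEST_LIMIT_EXCEEDED":
            raise Exception("Monthly request quota exhausted")
        if attempt < max_retries - 1:
            # Fall back to 60 seconds if Retry-After is missing.
            time.sleep(int(response.headers.get("Retry-After", 60)))
    raise Exception("Rate limit exceeded after retries")
```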
Monitor usage
- Check the X-RateLimit-Remaining header on responses
- Monitor usage in dashboard before month end
- Set up alerts for high usage
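A client-side check on the remaining-requests header might look like the sketch below. It assumes X-RateLimit-Remaining carries an integer count for the endpoint just called, which this document does not spell out. The function name `is_near_limit` and the 10% threshold are illustrative.

```python
def is_near_limit(headers, limit, threshold=0.1):
    """Return True when X-RateLimit-Remaining is at or below `threshold` of `limit`.

    Returns False if the header is absent or not an integer.
    """
    raw = headers.get("X-RateLimit-Remaining")
    try:
        remaining = int(raw)
    except (TypeError, ValueError):
        return False
    return remaining <= limit * threshold
```

With the 500-request limit on /generate, this flags responses once 50 or fewer requests remain in the window.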