Rate Limits
API limits are applied per API key. The current tier is controlled by apikey.rate_limit_tier.
Credits decide how many generations a user can pay for. API tiers decide how fast the user can call the API and how many tasks can run at the same time.
Tiers
| Tier | Create task | Query task | Active tasks |
|---|---|---|---|
| free | 1/s, 10/min | 2/s, 60/min | 1 |
| basic | 5/s, 180/min | 20/s, 1000/min | 5 |
| standard | 10/s, 600/min | 50/s, 3000/min | 15 |
| pro | 20/s, 1200/min | 100/s, 6000/min | 30 |
| ultra | 50/s, 3000/min | 200/s, 12000/min | 80 |
| enterprise | 100/s, 6000/min | 500/s, 30000/min | 300 |
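The per-second and per-minute columns constrain different things: the per-second value caps bursts, while the per-minute value caps the sustained rate. The sketch below copies the table into a typed constant and derives an approximate minimum spacing for evenly paced requests. The names `TIER_LIMITS`, `TierLimits`, and `minIntervalMs` are illustrative and not part of the API.

```typescript
// Tier defaults copied from the table above. All identifiers here are
// hypothetical helpers for client code, not names exposed by the API.
type Tier = "free" | "basic" | "standard" | "pro" | "ultra" | "enterprise";

interface RateLimit {
  perSecond: number;
  perMinute: number;
}

interface TierLimits {
  createTask: RateLimit;
  queryTask: RateLimit;
  activeTasks: number;
}

const TIER_LIMITS: Record<Tier, TierLimits> = {
  free:       { createTask: { perSecond: 1,   perMinute: 10 },   queryTask: { perSecond: 2,   perMinute: 60 },    activeTasks: 1 },
  basic:      { createTask: { perSecond: 5,   perMinute: 180 },  queryTask: { perSecond: 20,  perMinute: 1000 },  activeTasks: 5 },
  standard:   { createTask: { perSecond: 10,  perMinute: 600 },  queryTask: { perSecond: 50,  perMinute: 3000 },  activeTasks: 15 },
  pro:        { createTask: { perSecond: 20,  perMinute: 1200 }, queryTask: { perSecond: 100, perMinute: 6000 },  activeTasks: 30 },
  ultra:      { createTask: { perSecond: 50,  perMinute: 3000 }, queryTask: { perSecond: 200, perMinute: 12000 }, activeTasks: 80 },
  enterprise: { createTask: { perSecond: 100, perMinute: 6000 }, queryTask: { perSecond: 500, perMinute: 30000 }, activeTasks: 300 },
};

// Approximate smallest gap (in ms) between evenly spaced requests that
// keeps the request rate within both the per-second and per-minute limits.
function minIntervalMs(limit: RateLimit): number {
  return Math.max(1000 / limit.perSecond, 60000 / limit.perMinute);
}
```

For example, `minIntervalMs(TIER_LIMITS.free.createTask)` is 6000: a free key may send one create request in a given second, but only ten per minute, so steady traffic needs about six seconds between requests. Adding a small margin on top of the computed interval is a reasonable precaution, since the exact window boundaries are not specified here.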
Limit behavior
The API checks both a 1-second window and a 60-second window.
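As a rough illustration of a two-window check, the sketch below keeps a log of accepted request timestamps and admits a request only if both the last second and the last 60 seconds have room. It assumes sliding windows; the API may use fixed windows or another scheme, and `DualWindowLimiter` is a hypothetical name. It is a client-side model for reasoning about the limits, not the server's implementation.

```typescript
// Minimal sliding-window-log limiter (assumption: sliding windows).
class DualWindowLimiter {
  private readonly perSecond: number;
  private readonly perMinute: number;
  // Accepted request times in milliseconds, oldest first.
  private timestamps: number[] = [];

  constructor(perSecond: number, perMinute: number) {
    this.perSecond = perSecond;
    this.perMinute = perMinute;
  }

  // Returns true and records the request if both windows have room;
  // returns false (and records nothing) otherwise.
  tryAcquire(nowMs: number): boolean {
    // Drop entries that have left the 60-second window.
    for (;;) {
      const oldest = this.timestamps[0];
      if (oldest === undefined || oldest > nowMs - 60000) break;
      this.timestamps.shift();
    }

    // Everything left is inside the 60-second window.
    const inLastMinute = this.timestamps.length;

    // Count entries inside the 1-second window, scanning from the newest.
    let inLastSecond = 0;
    for (let i = this.timestamps.length - 1; i >= 0; i--) {
      const t = this.timestamps[i];
      if (t === undefined || t <= nowMs - 1000) break;
      inLastSecond++;
    }

    if (inLastSecond >= this.perSecond || inLastMinute >= this.perMinute) {
      return false;
    }
    this.timestamps.push(nowMs);
    return true;
  }
}
```

With the free tier's create limits (`new DualWindowLimiter(1, 10)`), a second call within the same second is refused by the 1-second window, and an eleventh call within the same minute is refused by the 60-second window even when it is spaced a full second after the previous one.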
If a request exceeds the limit for either window, the API rejects it and returns:
{
  "code": 429,
  "msg": "rate limit exceeded",
  "data": {
    "retryAfter": 17,
    "limit": 180,
    "tier": "basic",
    "windowSeconds": 60
  }
}
Rejected requests do not enter the generation queue.
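A client can use the fields in this response to back off before retrying. The sketch below is one possible approach, with two assumptions that the text above does not confirm: `retryAfter` is measured in seconds, and the error arrives as a parsed JSON body shaped like the example. The helper names (`asRateLimitError`, `retryDelayMs`, `withRateLimitRetry`) and the injected `send` function are hypothetical.

```typescript
// Shape of the rate-limit response shown above.
interface RateLimitError {
  code: 429;
  msg: string;
  data: {
    retryAfter: number;
    limit: number;
    tier: string;
    windowSeconds: number;
  };
}

// Returns the body typed as a rate-limit error, or null if it is
// anything else (success, a different error, or malformed data).
function asRateLimitError(body: unknown): RateLimitError | null {
  if (typeof body !== "object" || body === null) return null;
  const b = body as { code?: unknown; data?: unknown };
  if (b.code !== 429 || typeof b.data !== "object" || b.data === null) return null;
  const d = b.data as { retryAfter?: unknown };
  return typeof d.retryAfter === "number" ? (body as RateLimitError) : null;
}

// Assumption: retryAfter is expressed in seconds.
function retryDelayMs(err: RateLimitError): number {
  return Math.max(0, err.data.retryAfter) * 1000;
}

// Calls `send` (any function that performs the request and resolves to
// the parsed JSON body) and retries after the suggested delay while the
// response is a rate-limit error, up to `maxAttempts` attempts in total.
async function withRateLimitRetry(
  send: () => Promise<unknown>,
  maxAttempts: number = 3,
): Promise<unknown> {
  for (let attempt = 1; ; attempt++) {
    const body = await send();
    const err = asRateLimitError(body);
    if (err === null || attempt >= maxAttempts) return body;
    await new Promise<void>((resolve) => setTimeout(resolve, retryDelayMs(err)));
  }
}
```

Because rejected requests never enter the generation queue, retrying a rejected create call after the delay should not produce a duplicate task. Under the seconds assumption, the example response above would translate to a 17-second wait.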
Enterprise overrides
Enterprise keys can override defaults with apikey.scopes.apiLimits.
{
  "apiLimits": {
    "activeTasks": 500,
    "endpoints": {
      "jobs.createTask": {
        "perSecond": 150,
        "perMinute": 9000
      },
      "jobs.recordInfo": {
        "perSecond": 800,
        "perMinute": 48000
      }
    }
  }
}
High enterprise traffic should use infrastructure designed for high-frequency counters, such as Cloudflare Durable Objects, KV, or Redis.
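One way to picture how an override combines with the tier defaults is a field-by-field merge, sketched below. This is an assumption about the semantics, not a description of the actual implementation: it supposes that `jobs.createTask` corresponds to the "Create task" column and `jobs.recordInfo` to the "Query task" column, and that any field missing from the override falls back to the enterprise default. `resolveLimits` and the type names are hypothetical.

```typescript
interface EndpointLimit {
  perSecond: number;
  perMinute: number;
}

interface ApiLimits {
  activeTasks: number;
  endpoints: Record<string, EndpointLimit>;
}

// Same shape as the apiLimits object above, with every field optional.
interface ApiLimitOverrides {
  activeTasks?: number;
  endpoints?: Record<string, Partial<EndpointLimit>>;
}

// Enterprise defaults from the tier table. The endpoint-to-column mapping
// is an assumption made for this sketch.
const ENTERPRISE_DEFAULTS: ApiLimits = {
  activeTasks: 300,
  endpoints: {
    "jobs.createTask": { perSecond: 100, perMinute: 6000 },
    "jobs.recordInfo": { perSecond: 500, perMinute: 30000 },
  },
};

// Merges overrides onto defaults field by field. Endpoints that appear
// only in the overrides are ignored in this sketch.
function resolveLimits(defaults: ApiLimits, overrides?: ApiLimitOverrides): ApiLimits {
  const endpoints: Record<string, EndpointLimit> = {};
  for (const [name, base] of Object.entries(defaults.endpoints)) {
    const o = overrides?.endpoints?.[name];
    endpoints[name] = {
      perSecond: o?.perSecond ?? base.perSecond,
      perMinute: o?.perMinute ?? base.perMinute,
    };
  }
  return {
    activeTasks: overrides?.activeTasks ?? defaults.activeTasks,
    endpoints,
  };
}
```

Passing the override object shown above would yield 500 active tasks, 150/s and 9000/min for `jobs.createTask`, and 800/s and 48000/min for `jobs.recordInfo`. An override that sets only `activeTasks` would leave both endpoints at their defaults.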