Rate Limits

Rate limits are applied per API key. Each key's tier is set by apikey.rate_limit_tier.

Credits and tiers control different things. Credits determine how many generations a user can pay for. The tier determines how fast the user can call the API and how many tasks can run at the same time.

Tiers

Tier	Create task	Query task	Active tasks
free	1/s, 10/min	2/s, 60/min	1
basic	5/s, 180/min	20/s, 1000/min	5
standard	10/s, 600/min	50/s, 3000/min	15
pro	20/s, 1200/min	100/s, 6000/min	30
ultra	50/s, 3000/min	200/s, 12000/min	80
enterprise	100/s, 6000/min	500/s, 30000/min	300
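Because both limits apply at once, the per-minute limit is the tighter one for some tiers (for example, basic allows 5/s for task creation but only 180/min, which is 3/s on average). The following is a minimal sketch of that arithmetic. The names Limit, Tier, TIERS, sustained_per_second and min_spacing_seconds are illustrative and not part of the API; the numbers are copied from the table above.

```python
from typing import NamedTuple


class Limit(NamedTuple):
    per_second: int
    per_minute: int


class Tier(NamedTuple):
    create_task: Limit
    query_task: Limit
    active_tasks: int


# Values copied from the tier table; the structure itself is hypothetical.
TIERS = {
    "free": Tier(Limit(1, 10), Limit(2, 60), 1),
    "basic": Tier(Limit(5, 180), Limit(20, 1000), 5),
    "standard": Tier(Limit(10, 600), Limit(50, 3000), 15),
    "pro": Tier(Limit(20, 1200), Limit(100, 6000), 30),
    "ultra": Tier(Limit(50, 3000), Limit(200, 12000), 80),
    "enterprise": Tier(Limit(100, 6000), Limit(500, 30000), 300),
}


def sustained_per_second(limit: Limit) -> float:
    """Highest steady request rate allowed by both the per-second and per-minute limit."""
    return min(limit.per_second, limit.per_minute / 60)


def min_spacing_seconds(limit: Limit) -> float:
    """Gap between evenly spaced requests sent at the sustained rate."""
    return 1 / sustained_per_second(limit)
```

A client that paces its own requests would likely want to stay somewhat below this rate, since the exact window boundaries used by the API are not specified here.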

Limit behavior

The API checks both a 1-second window and a 60-second window.
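One way to picture the two-window check is a limiter that keeps the timestamps of accepted requests and counts how many fall in the last second and the last 60 seconds. This is only a sketch: it assumes sliding windows, and the API may use fixed windows or another scheme. DualWindowLimiter is a hypothetical name.

```python
from collections import deque


class DualWindowLimiter:
    """Sliding-log limiter enforcing a 1-second and a 60-second window (assumed design)."""

    def __init__(self, per_second: int, per_minute: int) -> None:
        self.per_second = per_second
        self.per_minute = per_minute
        self._accepted = deque()  # timestamps (seconds) of accepted requests

    def allow(self, now: float) -> bool:
        # Drop timestamps that have left the 60-second window.
        while self._accepted and now - self._accepted[0] >= 60:
            self._accepted.popleft()
        in_minute = len(self._accepted)
        in_second = sum(1 for t in self._accepted if now - t < 1)
        # A request is rejected if either window is already full.
        if in_second >= self.per_second or in_minute >= self.per_minute:
            return False
        self._accepted.append(now)
        return True
```

In this sketch, rejected requests are not recorded, so they do not count against later windows.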

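If a request exceeds either window, the API rejects it with a response like the following. In this example, a basic-tier key has exceeded its 180/min create-task limit in the 60-second window: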

{
  "code": 429,
  "msg": "rate limit exceeded",
  "data": {
    "retryAfter": 17,
    "limit": 180,
    "tier": "basic",
    "windowSeconds": 60
  }
}

Rejected requests do not enter the generation queue.
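A client can use the rejection body to decide when to try again. The sketch below assumes that retryAfter is a number of seconds, which the response example suggests but does not state. call_with_retry and its send parameter are hypothetical: send stands for any function that performs the request and returns the parsed JSON body.

```python
import time
from typing import Callable


def call_with_retry(
    send: Callable[[], dict],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call send() and retry while the body carries code 429."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    body: dict = {}
    for attempt in range(max_attempts):
        body = send()
        if body.get("code") != 429:
            return body
        if attempt < max_attempts - 1:
            # Assumption: retryAfter is in seconds; wait 1 second if it is missing.
            data = body.get("data") or {}
            sleep(float(data.get("retryAfter", 1)))
    # Attempts exhausted: hand the last rejection back to the caller.
    return body
```

Passing sleep as a parameter keeps the function easy to test without real delays.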

Enterprise overrides

Enterprise keys can override defaults with apikey.scopes.apiLimits.

{
  "apiLimits": {
    "activeTasks": 500,
    "endpoints": {
      "jobs.createTask": {
        "perSecond": 150,
        "perMinute": 9000
      },
      "jobs.recordInfo": {
        "perSecond": 800,
        "perMinute": 48000
      }
    }
  }
}
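As an illustration of how such an override might combine with the enterprise defaults, the sketch below merges apiLimits over the values from the tier table. It rests on two assumptions not confirmed here: that jobs.createTask and jobs.recordInfo correspond to the "Create task" and "Query task" columns, and that an override replaces only the fields it names. ENTERPRISE_DEFAULTS and effective_limits are hypothetical names.

```python
import copy

# Enterprise row of the tier table, keyed like the override object.
# Assumption: jobs.createTask = "Create task", jobs.recordInfo = "Query task".
ENTERPRISE_DEFAULTS = {
    "activeTasks": 300,
    "endpoints": {
        "jobs.createTask": {"perSecond": 100, "perMinute": 6000},
        "jobs.recordInfo": {"perSecond": 500, "perMinute": 30000},
    },
}


def effective_limits(scopes: dict) -> dict:
    """Return the defaults with any apiLimits overrides applied field by field."""
    limits = copy.deepcopy(ENTERPRISE_DEFAULTS)
    override = (scopes or {}).get("apiLimits") or {}
    if "activeTasks" in override:
        limits["activeTasks"] = override["activeTasks"]
    for name, values in (override.get("endpoints") or {}).items():
        # Fields the override does not mention keep their default values.
        limits["endpoints"].setdefault(name, {}).update(values)
    return limits
```

With the override shown above, this would yield 500 active tasks, 150/s and 9000/min for jobs.createTask, and 800/s and 48000/min for jobs.recordInfo.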

For high-volume enterprise traffic, rate-limit state should live in infrastructure designed for high-frequency counters, such as Cloudflare Durable Objects, KV, or Redis.
