Rate Limits

The public API protects shared infrastructure with traffic-shaping policies. Integrations should treat rate limits as part of the runtime contract, not as exceptional failures, and respond with backoff and retry.

What To Expect

Once rate limiting ships, it will cover:

  • HTTP 429 Too Many Requests: the canonical response when a caller exceeds an applicable limit. The body uses the standard error envelope with a stable code.
  • Retry-After header: present on 429 responses, giving the minimum number of seconds to wait before the next attempt has a chance of succeeding.
  • X-RateLimit-* headers: exposed on every response (success or failure) with the active limit, the remaining budget in the current window, and when the window resets.
  • Per-actor scoping: limits apply per service principal or per human actor, so one noisy integration does not consume a tenant's whole budget.
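
As a rough illustration of how a client might read these headers, here is a minimal Python sketch. The exact header names are not published yet: X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset are assumptions based on common convention, and the unit of the reset value (epoch timestamp or seconds remaining) is also unspecified, so it is kept as a raw integer. RateLimitInfo and parse_rate_limit_headers are hypothetical names used only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional


@dataclass
class RateLimitInfo:
    """Rate-limit state read from one response. Missing values stay None."""
    limit: Optional[int] = None          # active limit (assumed header name)
    remaining: Optional[int] = None      # budget left in the current window
    reset: Optional[int] = None          # raw reset value; unit not yet specified
    retry_after: Optional[float] = None  # seconds to wait, from Retry-After


def _parse_number(value: Optional[str], cast: Callable[[str], float]):
    """Return cast(value), or None when the header is absent or malformed."""
    if value is None:
        return None
    try:
        return cast(value.strip())
    except ValueError:
        return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    """Extract rate-limit fields from a header mapping, ignoring name casing."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitInfo(
        # The three X-RateLimit-* names below are assumptions, not a published contract.
        limit=_parse_number(lowered.get("x-ratelimit-limit"), int),
        remaining=_parse_number(lowered.get("x-ratelimit-remaining"), int),
        reset=_parse_number(lowered.get("x-ratelimit-reset"), int),
        retry_after=_parse_number(lowered.get("retry-after"), float),
    )
```

A client could call parse_rate_limit_headers on every response and slow down proactively when remaining approaches zero, rather than waiting for a 429.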

Until Then

Until concrete numbers are published, integrations should already plan for:

  • Exponential backoff with jitter on 429, 502, and 503 responses. Start at ~250 ms, double on each attempt, cap at a few seconds, and add randomness so clients do not retry in lockstep.
  • Idempotency keys on retries so a retried POST does not duplicate the operation. See Idempotency And Concurrency.
  • Bounded concurrency in your client. Most rate limits punish bursts of parallel requests more than steady throughput.
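
The first two points can be sketched together in Python. This is one possible shape under stated assumptions, not a reference client: it uses the "full jitter" variant of exponential backoff (a uniform random delay between 0 and the capped exponential ceiling), a 4-second cap as a stand-in for "a few seconds", and an Idempotency-Key header whose name is assumed here (the Idempotency And Concurrency section is the authority on the real one). The send callable, backoff_delay, and send_with_retry are hypothetical names; send is expected to take extra headers and return a (status, headers) pair.

```python
import random
import time
import uuid
from typing import Callable, Dict, Mapping, Tuple

RETRYABLE_STATUSES = {429, 502, 503}
Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)


def backoff_delay(attempt: int, base: float = 0.25, cap: float = 4.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Full-jitter delay: uniform in [0, min(cap, base * 2**attempt))."""
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


def _retry_after_seconds(headers: Mapping[str, str]) -> float:
    """Read Retry-After as seconds; return 0.0 if absent or not numeric."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return float(value)
            except ValueError:
                return 0.0
    return 0.0


def send_with_retry(send: Callable[[Dict[str, str]], Response],
                    max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep,
                    rng: Callable[[], float] = random.random) -> Response:
    """Call send() until it returns a non-retryable status or attempts run out."""
    # One key for the whole logical operation, reused on every retry.
    # The header name "Idempotency-Key" is an assumption.
    extra_headers = {"Idempotency-Key": str(uuid.uuid4())}
    for attempt in range(max_attempts):
        status, headers = send(extra_headers)
        if status not in RETRYABLE_STATUSES:
            return status, headers
        if attempt == max_attempts - 1:
            break  # out of attempts; hand the last response to the caller
        # Never retry sooner than the server asked for via Retry-After.
        delay = max(backoff_delay(attempt, rng=rng), _retry_after_seconds(headers))
        sleep(delay)
    return status, headers
```

For the third point, one simple option is to dispatch calls through a concurrent.futures.ThreadPoolExecutor with a small max_workers value, so the number of in-flight requests stays bounded regardless of how much work is queued.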