Rate Limits

Limits by plan (Starter/Pro/Enterprise), rate limit headers, handling 429 responses.

NFYio enforces rate limits to ensure fair usage and system stability. Limits vary by plan. When exceeded, the API returns 429 Too Many Requests. Use the response headers to implement backoff and retry logic.

Limits by Plan

Starter

| Resource | Limit | Window |
| --- | --- | --- |
| API requests | 1,000 requests | per minute |
| Storage API | 500 requests | per minute |
| Agent chat | 100 messages | per minute |
| Agent queries | 50 queries | per minute |
| Buckets | 10 buckets | total |
| Objects | 10,000 objects | total |

Pro

| Resource | Limit | Window |
| --- | --- | --- |
| API requests | 10,000 requests | per minute |
| Storage API | 5,000 requests | per minute |
| Agent chat | 500 messages | per minute |
| Agent queries | 200 queries | per minute |
| Buckets | 100 buckets | total |
| Objects | 1,000,000 objects | total |

Enterprise

| Resource | Limit | Window |
| --- | --- | --- |
| API requests | Custom (contact sales) | per minute |
| Storage API | Custom | per minute |
| Agent chat | Custom | per minute |
| Agent queries | Custom | per minute |
| Buckets | Unlimited | |
| Objects | Unlimited | |

Enterprise plans can configure custom limits per organization.

Rate Limit Headers

Every API response includes headers indicating current usage:

| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Maximum requests allowed in the window |
| X-RateLimit-Remaining | Requests remaining in the current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
| Retry-After | Seconds to wait before retrying (only on 429) |

Example Response Headers

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 847
X-RateLimit-Reset: 1709304660

When you hit the limit:

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1709304720
Retry-After: 60

{
  "error": {
    "code": "RateLimitExceeded",
    "message": "Rate limit exceeded. Retry after 60 seconds.",
    "details": {
      "retry_after": 60,
      "limit": 1000,
      "window": "1m"
    }
  }
}
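
These headers can be read directly from any fetch Response. A small helper keeps the parsing in one place (`parseRateLimitHeaders` is an illustrative name, not part of an NFYio SDK):

```javascript
// Parse NFYio rate limit headers from a fetch Response's Headers object.
// Works in browsers and Node 18+, where Headers and fetch are built in.
function parseRateLimitHeaders(headers) {
  return {
    limit: Number(headers.get('X-RateLimit-Limit')),
    remaining: Number(headers.get('X-RateLimit-Remaining')),
    // X-RateLimit-Reset is a Unix timestamp in seconds.
    resetAt: new Date(Number(headers.get('X-RateLimit-Reset')) * 1000),
  };
}

// Usage, after `const response = await fetch(url)`:
// const { remaining, resetAt } = parseRateLimitHeaders(response.headers);
```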

Handling 429 Responses

1. Respect Retry-After

The Retry-After header (or details.retry_after in the JSON body) tells you how long to wait:

async function fetchWithRateLimitHandling(url, options) {
  const response = await fetch(url, options);

  if (response.status === 429) {
    // Retry-After is a string of seconds; fall back to 60 if missing.
    const retryAfter = Number(response.headers.get('Retry-After')) || 60;
    console.warn(`Rate limited. Retrying after ${retryAfter}s`);
    await new Promise((r) => setTimeout(r, retryAfter * 1000));
    return fetchWithRateLimitHandling(url, options); // Retry
  }

  return response;
}

2. Proactive Throttling

Track X-RateLimit-Remaining and slow down before hitting the limit:

let remaining = 1000;
let resetAt = 0; // Unix timestamp (seconds) from the last response

async function throttledFetch(url, options) {
  if (remaining < 10) {
    const waitMs = resetAt * 1000 - Date.now();
    await new Promise((r) => setTimeout(r, Math.max(0, waitMs)));
  }
  const response = await fetch(url, options);
  const rem = parseInt(response.headers.get('X-RateLimit-Remaining'), 10);
  if (!Number.isNaN(rem)) remaining = rem;
  const reset = parseInt(response.headers.get('X-RateLimit-Reset'), 10);
  if (!Number.isNaN(reset)) resetAt = reset;
  return response;
}

3. Batch Requests

Reduce request count by batching where possible:

// Instead of 100 separate GETs (AWS SDK v2 style; calls return
// a Request, so .promise() is needed to get a Promise)
const objects = await Promise.all(
  keys.map((key) => s3.getObject({ Bucket: bucket, Key: key }).promise())
);

// Use ListObjects with a prefix, or batch APIs if available
const { Contents } = await s3
  .listObjectsV2({ Bucket: bucket, Prefix: 'folder/' })
  .promise();

Limits by Endpoint Type

Some endpoints have stricter limits:

| Endpoint Type | Starter | Pro | Enterprise |
| --- | --- | --- | --- |
| Auth (login/token) | 20/min | 100/min | Custom |
| Embedding trigger | 5/min | 20/min | Custom |
| VPC operations | 50/min | 200/min | Custom |
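
For strict endpoints like auth (20/min on Starter), client-side throttling avoids 429s entirely. One common approach is a token bucket; the sketch below is illustrative (the class and rates are examples, not an NFYio SDK feature):

```javascript
// Minimal client-side token bucket: refills `ratePerMin` tokens per
// minute up to a capacity of `ratePerMin`, spending one per request.
class TokenBucket {
  constructor(ratePerMin) {
    this.capacity = ratePerMin;
    this.tokens = ratePerMin;
    this.refillPerMs = ratePerMin / 60000;
    this.last = Date.now();
  }

  // Returns true and spends a token if one is available, else false.
  tryRemove() {
    const now = Date.now();
    this.tokens = Math.min(
      this.capacity,
      this.tokens + (now - this.last) * this.refillPerMs
    );
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Example: match the Starter auth limit of 20 requests per minute.
const authBucket = new TokenBucket(20);
// if (authBucket.tryRemove()) { /* safe to call the auth endpoint */ }
```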

Best Practices

  1. Cache responses — Reduce redundant API calls with client-side caching
  2. Use webhooks — Prefer webhooks over polling where supported
  3. Implement backoff — Always respect Retry-After on 429
  4. Monitor usage — Track X-RateLimit-Remaining in logs or dashboards
  5. Upgrade if needed — Pro/Enterprise plans offer higher limits
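
The backoff practice above is often combined with exponential delays and jitter so that many clients do not retry in lockstep. A sketch (function names are illustrative; the full-jitter strategy is a general pattern, not NFYio-specific):

```javascript
// Exponential backoff with full jitter: the delay ceiling doubles each
// attempt (capped at capMs), and the actual delay is uniform in [0, ceiling).
function backoffDelayMs(attempt, baseMs = 1000, capMs = 60000) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * ceiling;
}

async function fetchWithBackoff(url, options, maxAttempts = 5) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const response = await fetch(url, options);
    if (response.status !== 429) return response;
    // Prefer the server's Retry-After over the computed delay when present.
    const retryAfter = Number(response.headers.get('Retry-After'));
    const delay = retryAfter > 0 ? retryAfter * 1000 : backoffDelayMs(attempt);
    await new Promise((r) => setTimeout(r, delay));
  }
  throw new Error(`Still rate limited after ${maxAttempts} attempts`);
}
```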

Next Steps