Rate Limits

How Bytekit enforces quotas, concurrency slots, and rate limits.

Bytekit enforces three independent limits: rate limits (requests per second), concurrency slots (simultaneous in-flight requests), and quota (monthly usage caps). All three are enforced before any capture work begins.

Quota model

Quota is pre-incremented on every request and corrected after the request completes:

  1. Pre-increment — your usage counter is incremented by an estimate before the capture starts. This prevents over-committing capacity even for concurrent requests.
  2. Adjust — once the response returns, the actual content_length (compressed wire bytes) replaces the estimate for bandwidth accounting.
  3. Compensating decrement — if the request fails with a terminal error, the pre-incremented amount is rolled back so you are not charged for failed requests.

This means your dashboard usage counter may briefly read higher than actual consumption during a burst of requests, then settle to the true value.
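The three steps above can be illustrated with a small in-memory model. This is only a sketch of the accounting pattern, not Bytekit's implementation: the `QuotaCounter` class, its method names, and the byte figures are all hypothetical, and it assumes the estimate is expressed in bytes.

```python
from dataclasses import dataclass


@dataclass
class QuotaCounter:
    """Hypothetical in-memory usage counter (illustration only)."""
    used: int = 0

    def pre_increment(self, estimate: int) -> int:
        # Step 1: reserve the estimated amount before the capture starts.
        self.used += estimate
        return estimate

    def adjust(self, reserved: int, content_length: int) -> None:
        # Step 2: swap the estimate for the actual compressed wire bytes.
        self.used += content_length - reserved

    def rollback(self, reserved: int) -> None:
        # Step 3: compensating decrement after a terminal error.
        self.used -= reserved


counter = QuotaCounter()

# Successful request: estimated at 500,000 bytes, actually 120,000.
reserved = counter.pre_increment(500_000)
peak = counter.used  # the value a dashboard could briefly show
counter.adjust(reserved, 120_000)
after_success = counter.used

# Failed request: the whole reservation is rolled back.
reserved = counter.pre_increment(500_000)
counter.rollback(reserved)
after_failure = counter.used

print(peak, after_success, after_failure)
```

The gap between `peak` and `after_success` is the temporary overshoot described above: the counter reads high while the request is in flight and settles once the real size is known.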

Concurrency slots

Each account has a concurrency limit — the maximum number of in-flight requests at one time. Slots are released automatically a few minutes after a client disconnects, so a crashed client cannot permanently hold them.

When you exceed your concurrency limit, the API returns a 429 rate_limited response with a Retry-After header indicating how many seconds to wait before retrying.
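One way to avoid hitting the concurrency limit is to cap in-flight requests on the client side. The sketch below uses a semaphore for that. `MAX_IN_FLIGHT` is an assumed value (substitute your account's real limit), and `fake_capture` is a hypothetical stand-in that sleeps instead of calling the API.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_IN_FLIGHT = 3  # assumed value; use your account's actual concurrency limit

slots = threading.BoundedSemaphore(MAX_IN_FLIGHT)
lock = threading.Lock()
in_flight = 0
peak_in_flight = 0


def fake_capture(url: str) -> str:
    """Stand-in for a real API call; sleeps instead of doing network I/O."""
    global in_flight, peak_in_flight
    with slots:  # blocks until a slot is free; released even if an error occurs
        with lock:
            in_flight += 1
            peak_in_flight = max(peak_in_flight, in_flight)
        try:
            time.sleep(0.01)
            return f"done: {url}"
        finally:
            with lock:
                in_flight -= 1


urls = [f"https://example.com/page/{i}" for i in range(10)]
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(fake_capture, urls))

print(len(results), peak_in_flight <= MAX_IN_FLIGHT)
```

Ten workers compete for three slots, so at most three calls run at once and the rest wait locally instead of being rejected with a 429.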

Rate limit headers

Every response includes headers that report where you stand against your rate limit:

Header                   Description
X-RateLimit-Limit        Your account's request-per-second limit
X-RateLimit-Remaining    Requests remaining in the current window
X-RateLimit-Reset        Unix timestamp when the window resets
Retry-After              Seconds to wait before retrying (present on 429 only)
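A client can use these headers to decide whether to pause before the next request. The helpers below are a minimal sketch: the function names are hypothetical, the sample values are illustrative rather than captured from a real response, and a plain dict lookup is case-sensitive, whereas most HTTP libraries expose a case-insensitive header mapping.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str],
                        now: Optional[float] = None) -> float:
    """Seconds until the window resets, based on X-RateLimit-Reset."""
    now = time.time() if now is None else now
    return max(0.0, float(headers["X-RateLimit-Reset"]) - now)


def should_pause(headers: Mapping[str, str]) -> bool:
    """True when no requests remain in the current window."""
    return int(headers["X-RateLimit-Remaining"]) <= 0


# Illustrative header values, not taken from a real response.
sample = {
    "X-RateLimit-Limit": "10",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "1700000005",
}
print(should_pause(sample), seconds_until_reset(sample, now=1700000000.0))
```

Passing `now` explicitly keeps the calculation deterministic in tests; in normal use it defaults to the current time.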

Bandwidth metering

Bandwidth usage is measured as content_length — the compressed wire bytes returned by the origin server. If the origin sends gzip or Brotli-encoded content, the compressed size is what counts, not the decompressed size. This matches what your client actually receives over the network.
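The difference between compressed and decompressed size can be seen with a quick local experiment. This sketch uses Python's standard gzip module on a made-up payload; it only illustrates which of the two sizes the metering rule above refers to and does not call Bytekit.

```python
import gzip

body = b"<p>hello world</p>" * 1000  # decompressed payload (made-up content)
wire = gzip.compress(body)           # what a gzip-encoding origin would send

metered_bytes = len(wire)        # the size that counts under the rule above
decompressed_bytes = len(body)   # the size that does not count

print(metered_bytes < decompressed_bytes)
```

Highly repetitive content like this compresses very well, so the metered size is a small fraction of the decompressed size. Already-compressed content such as images shows little or no difference.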

Handling 429 responses

When you receive 429 rate_limited:

  1. Read the Retry-After header value (in seconds).
  2. Wait at least that long before retrying.
  3. If no Retry-After is present, use exponential backoff starting at 1 second.
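
# Check rate limit headers: print the response headers and discard the body.
# -D - writes headers to stdout. (-I cannot be combined with -d, because
# curl refuses to send HEAD and POST in the same request.)
curl -sS -D - -o /dev/null -X POST https://api.bytekit.com/v1/scrape \
  -H "Authorization: Bearer sk_live_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'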
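The three retry steps listed under "Handling 429 responses" can be sketched as a small helper. The function names, the 60-second cap, and the retry count are assumptions added for illustration. `send` is a hypothetical callable returning `(status, headers, body)`, so the logic can be exercised without a network call.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]


def retry_delay(headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait after the 429 received on 0-based attempt `attempt`."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))  # steps 1 and 2
        except ValueError:
            pass  # not a number of seconds; fall back to backoff
    # Step 3: exponential backoff from 1 second (the cap is an assumption).
    return min(cap, base * (2 ** attempt))


def call_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt + 1 < max_attempts:
            sleep(retry_delay(headers, attempt))
    raise RuntimeError("still rate limited after retries")


# Demo with canned responses and a recording sleep instead of real waiting.
canned = iter([(429, {"Retry-After": "2"}, ""), (429, {}, ""), (200, {}, "ok")])
waits = []
result = call_with_retry(lambda: next(canned), sleep=waits.append)
print(result, waits)
```

Injecting `sleep` keeps the demo instant. In real use, leave the default so the client actually waits.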

Next steps

  • Errors — full error code reference and retry guidance
  • Scraping — how quota interacts with the fast and slow paths