Rate Limits
How Bytekit enforces quotas, concurrency slots, and rate limits.
Bytekit enforces three independent limits: rate limits (requests per second), concurrency slots (simultaneous in-flight requests), and quota (monthly usage caps). All three are enforced before any capture work begins.
Quota model
Quota is pre-incremented on every request and corrected after the request completes:
- Pre-increment: your usage counter is incremented by an estimate before the capture starts. This prevents over-committing capacity even for concurrent requests.
- Adjust: once the response returns, the actual content_length (compressed wire bytes) replaces the estimate for bandwidth accounting.
- Compensating decrement: if the request fails with a terminal error, the pre-incremented amount is rolled back so you are not charged for failed requests.
This means your dashboard usage counter may briefly read higher than actual consumption during a burst of requests, then settle to the true value.
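The three accounting steps can be pictured with a toy counter. This is only an illustrative sketch of the model described above, not Bytekit's implementation: the QuotaCounter class, its method names, and the 50,000-byte estimate are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QuotaCounter:
    """Toy model of pre-increment / adjust / compensating decrement."""
    used_bytes: int = 0

    def pre_increment(self, estimate: int) -> int:
        # Reserve the estimate before any capture work starts.
        self.used_bytes += estimate
        return estimate

    def adjust(self, reserved: int, content_length: int) -> None:
        # Swap the reserved estimate for the actual compressed wire bytes.
        self.used_bytes += content_length - reserved

    def roll_back(self, reserved: int) -> None:
        # Terminal failure: undo the reservation so nothing is charged.
        self.used_bytes -= reserved


counter = QuotaCounter()

ok = counter.pre_increment(50_000)      # request A reserved (hypothetical estimate)
failed = counter.pre_increment(50_000)  # request B reserved
peak = counter.used_bytes               # both reservations are counted while in flight

counter.adjust(ok, content_length=32_768)  # A succeeded with 32,768 wire bytes
counter.roll_back(failed)                  # B hit a terminal error
settled = counter.used_bytes               # only A's actual bytes remain
```

In this sketch the counter reads 100,000 while both requests are in flight and settles to 32,768 afterwards, which is the brief over-read the dashboard can show during a burst.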
Concurrency slots
Each account has a concurrency limit — the maximum number of in-flight requests at one time. Slots are released automatically a few minutes after a client disconnects, so a crashed client cannot permanently hold them.
When you exceed your concurrency limit, you receive 429 rate_limited with a Retry-After header indicating how long to wait before retrying.
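One way to stay under the concurrency limit is to cap in-flight requests on the client side. The sketch below assumes a limit of 3 (MAX_IN_FLIGHT is a placeholder, so substitute your account's real limit) and uses a hypothetical fake_request function in place of an actual HTTP call.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_IN_FLIGHT = 3  # hypothetical; use your account's actual concurrency limit

_slots = threading.BoundedSemaphore(MAX_IN_FLIGHT)
_lock = threading.Lock()
_in_flight = 0
peak_in_flight = 0


def with_slot(send):
    """Run send() while holding one of MAX_IN_FLIGHT client-side slots."""
    global _in_flight, peak_in_flight
    with _slots:  # blocks until a slot is free
        with _lock:
            _in_flight += 1
            peak_in_flight = max(peak_in_flight, _in_flight)
        try:
            return send()
        finally:
            with _lock:
                _in_flight -= 1


def fake_request():
    # Stand-in for a real HTTP call to the API.
    time.sleep(0.01)
    return 200


# Ten worker threads compete, but at most MAX_IN_FLIGHT calls run at once.
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(lambda _: with_slot(fake_request), range(20)))
```

A client-side cap like this reduces 429 responses but does not replace handling them, since other processes using the same account share the same slots.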
Rate limit headers
Every response includes headers that show your current position:
| Header | Description |
|---|---|
| X-RateLimit-Limit | Your account's request-per-second limit |
| X-RateLimit-Remaining | Requests remaining in the current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
| Retry-After | Seconds to wait before retrying (present on 429 only) |
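A client can use these headers to pause before it runs out of budget. The helper below is a minimal sketch: seconds_until_reset is a hypothetical name, it assumes the header values are plain integers, and it takes a plain dict, so header names must match exactly (most HTTP libraries expose a case-insensitive mapping instead).

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str],
                        now: Optional[float] = None) -> float:
    """Return how long to pause before sending again, or 0.0 if budget remains."""
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    # No requests left in this window: wait until the reset timestamp.
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)


exhausted = {
    "X-RateLimit-Limit": "10",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "1700000005",
}
wait = seconds_until_reset(exhausted, now=1_700_000_000.0)  # 5.0

has_budget = dict(exhausted, **{"X-RateLimit-Remaining": "3"})
no_wait = seconds_until_reset(has_budget, now=1_700_000_000.0)  # 0.0
```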
Bandwidth metering
Bandwidth usage is measured as content_length — the compressed wire bytes returned by the
origin server. If the origin sends gzip or Brotli-encoded content, the compressed size is what
counts, not the decompressed size. This matches what your client actually receives over the
network.
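To make the difference concrete, the snippet below compresses a repetitive payload with gzip and compares the two sizes. It is only an illustration using Python's standard library; the payload is made up, and actual compression ratios depend on the content the origin serves.

```python
import gzip

body = b"<p>hello bytekit</p>" * 500  # 10,000 bytes once decompressed
wire = gzip.compress(body)            # what a gzip-encoding origin would transmit

metered_bytes = len(wire)             # compressed size: what counts as bandwidth
decompressed_bytes = len(body)        # decompressed size: not what is metered
```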
Handling 429 responses
When you receive 429 rate_limited:
- Read the Retry-After header value (in seconds).
- Wait at least that long before retrying.
- If no Retry-After is present, use exponential backoff starting at 1 second.
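# Check rate limit headers (-i prints the response headers along with the body;
# -I would request HEAD, which curl rejects when combined with -d)
curl -i -X POST https://api.bytekit.com/v1/scrape \
-H "Authorization: Bearer sk_live_YOUR_KEY_HERE" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com"}'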
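The 429 handling steps can be wrapped in a small retry helper. This is a sketch under stated assumptions rather than an official client: send_with_retry and backoff_delay are hypothetical names, send is any callable you supply that returns a (status, headers) pair, and Retry-After is assumed to be a number of seconds as described above.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)


def backoff_delay(headers: Mapping[str, str], attempt: int,
                  base: float = 1.0) -> float:
    """Prefer Retry-After; otherwise back off exponentially from `base` seconds."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return float(retry_after)
    return base * (2 ** attempt)  # 1s, 2s, 4s, ...


def send_with_retry(send: Callable[[], Response],
                    max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it returns a non-429 status or attempts run out."""
    response = send()
    for attempt in range(max_attempts - 1):
        status, headers = response
        if status != 429:
            break
        sleep(backoff_delay(headers, attempt))
        response = send()
    return response


# Demo with a scripted sequence of responses instead of real HTTP calls.
scripted = iter([(429, {"Retry-After": "3"}), (429, {}), (200, {})])
delays = []  # record the waits instead of actually sleeping
final_status, _ = send_with_retry(lambda: next(scripted), sleep=delays.append)
```

Passing sleep as a parameter keeps the helper testable; in real use, leave the default so the client actually waits. Adding random jitter to the fallback delay is a common refinement when many clients retry at once.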