URL: https://rapidcrawl.dev/docs/changelog
# Changelog
Release history for the Bytekit API and SDKs
No releases yet. Check back after the first version ships.
---
URL: https://rapidcrawl.dev/docs/
# Bytekit
The web data API for AI agents and pipelines — scrape, screenshot, record, and monitor any URL through one endpoint.
Bytekit is a REST API for capturing what a webpage looks like and what it says. One key, one
base URL: scrape content, take screenshots, record scrolling video, discover sitemaps, and watch
URLs for changes — with each request automatically routed through the right rendering path.
## built-for-agents
Point your AI agent at the documentation index for a complete, machine-readable map of the API.
---
URL: https://rapidcrawl.dev/docs/introduction
# Introduction
A REST API for screenshots, scraping, and visual change detection that routes each request through the right rendering pipeline automatically.
## what-bytekit-is
Bytekit is a REST API for capturing what a webpage looks like and what it says —
screenshots, scrolling recordings, raw HTML or clean markdown, and visual change detection.
One API key, one base URL, every output you'd otherwise stitch together from a headless-browser
farm and a scraping proxy.
The differentiator isn't a feature — it's the routing. Most scraping APIs either fetch HTML
over HTTP (fast, but brittle on JS-heavy pages and easily blocked) or render every request in a
full browser (slower and more expensive, but works on more sites). Bytekit does both, in
that order, automatically. Simple URLs come back in ~500–800ms. JS-heavy URLs, or pages behind
bot protection, fall through to full browser rendering. Same request, same response shape — your
code doesn't change.
## what-you-can-do-with-it
- Scrape: Fetch a URL as raw HTML, clean markdown, or structured content.
- Screenshot: Capture full-page or viewport PNG/JPEG/WebP.
- Record: Generate a scrolling video of a page.
- Monitor: Watch a URL on a schedule and receive a webhook when the rendered output changes.
- Bulk: Fan out thousands of URLs in parallel and receive a webhook as each finishes.
- Sitemap: Discover URLs from a domain's sitemap or by light crawling.
## how-requests-flow
Every capture endpoint follows the same routing model:
- Fast path. Simple requests are fetched directly over a geo-routed proxy and returned in around 500–800ms (median). The response carries X-Scrape-* headers describing how it was handled.
- Slow path. Browser-only options (wait_for_selector, wait_until: networkidle, custom cookies or headers, any non-zero delay_ms) render the page in a full browser, typically in 3–15 seconds.
- Automatic retry. When a direct fetch is blocked by bot protection, Bytekit transparently retries the request in a browser that can clear common challenges. You see the success; you don't write the retry.

The country option routes through a geo-located proxy on the fast path; it does not force a browser render on its own. Any capture call also accepts "async": true, in which case the API returns 202 with an ID like sc_…, ss_…, or rec_… that you poll or receive via webhook.

See the Scraping guide for the full trigger table and fallback semantics.
## authentication-and-accounts
Every request carries a Bearer API key. Keys are prefixed sk_live_ for production and
sk_test_ for staging. Manage them from the dashboard — up to 50 active keys per account, with
per-key billing attribution so you can split usage cleanly across services or environments.
Full details in the Authentication guide.
## where-to-go-next
- Quickstart: first call in under five minutes, no SDK required.
- Client Libraries: official TypeScript and Python SDKs.
- Scraping: fast vs slow path, automatic browser retry, async polling.
- Rate Limits: quota model, concurrency slots, response headers.
- Monitors: schedule-based change detection with webhooks.
- API Reference: every endpoint, request, and response.
---
URL: https://rapidcrawl.dev/docs/quickstart
# Quickstart
Make your first Bytekit API call in under 5 minutes — no SDK required.
Nothing to install. You only need curl and an API key.
## get-an-api-key
Sign up at app.bytekit.com and create an API key from the
dashboard. Keys are prefixed sk_live_ for production and sk_test_ for staging.
See Authentication for details on key formats and management.
## make-your-first-scrape
Use the staging endpoint with the test key to try it without affecting production quotas:
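A minimal Python sketch of such a call, using only the standard library. The staging host comes from the Authentication guide; the `https` scheme, the `url` body field, and the helper name `build_scrape_request` are assumptions for illustration, not confirmed request schema.

```python
import json
import urllib.request

# Assumption: staging host from the Authentication guide, served over HTTPS.
STAGING_BASE_URL = "https://api-stg.bytekit.com"


def build_scrape_request(api_key: str, target_url: str,
                         base_url: str = STAGING_BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/scrape request."""
    # Assumption: the request body names the page to scrape in a "url" field.
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/scrape",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

To send it, pass the returned object to `urllib.request.urlopen(request, timeout=30)` and decode the JSON body of the response.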
A successful response looks like this:
content_length is the compressed wire size in bytes (used for bandwidth billing). The id
field uses the sc_ prefix; you can poll GET /v1/scrape/{id} if the scrape was queued
asynchronously.
## call-production
When you are ready to hit production, swap the base URL and use your live key:
## prefer-an-sdk
Skip the raw HTTP and use an official client:
- TypeScript SDK: npm install @rapidcrawl/sdk
- Python SDK: pip install rapidcrawl-python
## next-steps
- Authentication: key formats, Bearer header, self-serve key management
- Scraping: fast path vs slow path, when each triggers, automatic browser retry
- Errors: error shape, common codes, retry guidance
- Rate Limits: quota model, concurrency slots, rate limit headers
- Monitors: visual change detection with webhook notifications
- API Reference: full endpoint documentation
---
URL: https://rapidcrawl.dev/docs/api
# API Reference
Every Bytekit endpoint, request, and response.
The Bytekit REST API is organized by capability. Every endpoint shares one base
URL and Bearer authentication — start with the Quickstart and
Authentication guides.
---
URL: https://rapidcrawl.dev/docs/cli/account
# Account
Account and API key operations
---
URL: https://rapidcrawl.dev/docs/cli/bulk
# Bulk
Bulk screenshot operations
---
URL: https://rapidcrawl.dev/docs/cli/fetch
# Fetch
Fetch URL content operations
---
URL: https://rapidcrawl.dev/docs/cli/monitors
# Monitors
Visual-change monitor operations
---
URL: https://rapidcrawl.dev/docs/cli/recordings
# Recordings
Recording operations
---
URL: https://rapidcrawl.dev/docs/cli/scrape
# Scrape
Scrape URL content
---
URL: https://rapidcrawl.dev/docs/cli/screenshots
# Screenshots
Screenshot operations
---
URL: https://rapidcrawl.dev/docs/cli/sitemap
# Sitemap
Sitemap crawl operations
---
URL: https://rapidcrawl.dev/docs/guides/authentication
# Authentication
How to authenticate with the Bytekit API using API keys.
Bytekit authenticates requests with Bearer API keys. Every request to a capture or data
endpoint must include an Authorization header.
## api-key-format
Keys are prefixed to make them easy to identify:
| Prefix | Environment |
|--------|-------------|
| sk_test_ | Staging (api-stg.bytekit.com) |
| sk_live_ | Production (api.bytekit.com) |

Never use a staging key against the production endpoint or vice versa — it will be rejected
with 401 unauthorized.
## sending-the-bearer-header
Include the key in every request:
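As a rough sketch in Python, the header and the matching base URL can be derived from the key itself. The helper names are hypothetical, and the `https` scheme on the staging host is an assumption; the prefixes and host names are the ones listed in this guide.

```python
# Key prefix -> base URL. Host names are from this guide; the staging scheme is assumed.
BASE_URLS = {
    "sk_test_": "https://api-stg.bytekit.com",
    "sk_live_": "https://api.bytekit.com",
}


def base_url_for_key(api_key: str) -> str:
    """Pick the environment that matches the key prefix, so a staging key
    is never sent to production (which would return 401 unauthorized)."""
    for prefix, base_url in BASE_URLS.items():
        if api_key.startswith(prefix):
            return base_url
    raise ValueError("API key must start with sk_test_ or sk_live_")


def auth_headers(api_key: str) -> dict:
    """Build the Authorization header carried by every request."""
    return {"Authorization": f"Bearer {api_key}"}
```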
The key is stored as a SHA-256 hash on the server. The plaintext value is only ever visible
once — at creation time.
## managing-api-keys
You can create and manage up to 50 active keys per account through the self-serve API.
## create-a-key
The response includes the plaintext key field exactly once. Store it immediately — it
cannot be retrieved again.
## list-keys
The list response omits the plaintext key — only the id, name, and metadata are returned.
## revoke-a-key
DELETE is idempotent — revoking an already-revoked key returns 204 No Content.
## key-limits
- Maximum 50 active keys per account.
- Attempting to create a 51st active key returns 409 with error code api_key_limit_reached. Revoke an existing key before creating a new one.
- Revoked keys do not count toward the limit.
## next-steps
- Quickstart: make your first API call
- API Reference, Account: full key management endpoint docs
---
URL: https://rapidcrawl.dev/docs/guides/errors
# Errors
Bytekit error response shape, common error codes, and retry guidance.
All errors from the Bytekit API follow a consistent envelope shape so you can handle them
programmatically.
## error-response-shape
| Field | Description |
|-------|-------------|
| error.code | Machine-readable string identifier |
| error.message | Human-readable explanation |
| error.details | Optional object with additional context (e.g. which field failed validation) |
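A small Python sketch of reading that envelope. It assumes the dotted field names above map to a nested JSON object under a top-level `error` key; the `ApiError` class and `parse_error` helper are illustrative names, not part of any SDK.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ApiError:
    code: str
    message: str
    details: Dict[str, Any] = field(default_factory=dict)


def parse_error(body: str) -> Optional[ApiError]:
    """Return an ApiError if the body looks like an error envelope, else None."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    err = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(err, dict):
        return None
    return ApiError(
        code=str(err.get("code", "")),
        message=str(err.get("message", "")),
        details=err.get("details") or {},  # details is optional
    )
```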
## common-error-codes
| HTTP status | Code | Meaning |
|-------------|------|---------|
| 400 | invalid_request | Malformed JSON, missing required field, or failed validation |
| 401 | unauthorized | Missing or invalid API key |
| 402 | quota_exceeded | Account has exceeded its plan quota |
| 403 | forbidden | Key is valid but lacks permission for this action |
| 429 | rate_limited | Too many requests; back off and retry |
| 502 | internal_error | The upstream fetch or browser render failed |
## webhook-urls-must-use-https
All webhook URLs submitted to the Bytekit API must use HTTPS. Plain HTTP URLs (e.g. http://example.com/webhook) are rejected with a 422 Unprocessable Entity error.
Why HTTPS only? Webhook payloads may contain sensitive data (API keys, session tokens, scrape results). HTTPS encryption protects this data in transit.
Request with an invalid webhook URL:
Response (HTTP 422):
Working example:
This applies to all endpoints that accept a webhook_url parameter: /v1/scrape (async), /v1/scrape/bulk, /v1/fetch/bulk, /v1/bulk, /v1/monitors, and /v1/sitemap.
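To avoid a round trip that ends in a 422, a client can check the scheme before submitting. This is a client-side sketch only; the helper name is hypothetical and the server may apply further validation that is not described here.

```python
from urllib.parse import urlparse


def is_valid_webhook_url(url: str) -> bool:
    """Accept only absolute HTTPS URLs (urlparse lowercases the scheme)."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and bool(parsed.netloc)
```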
## retry-guidance
Not all errors are retryable. Use this table to decide:
| Code | Retry? | Guidance |
|------|--------|----------|
| invalid_request | No | Fix the request before retrying |
| unauthorized | No | Check your API key |
| quota_exceeded | No | Upgrade your plan or wait for quota reset |
| forbidden | No | You do not have access to this resource |
| rate_limited | Yes | Exponential backoff; honour the Retry-After header |
| internal_error | Yes | Exponential backoff; most resolve on retry |

For retriable errors, start with a 1-second delay and double on each subsequent failure, up to
a maximum of 60 seconds. Respect the Retry-After header value when it is present on 429
responses.
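The schedule described above can be sketched in Python as follows. The function names are illustrative; the 1-second start, doubling, 60-second cap, and the set of retryable codes come from this guide.

```python
from typing import Optional

RETRYABLE_CODES = {"rate_limited", "internal_error"}


def should_retry(error_code: str) -> bool:
    """Only rate_limited and internal_error are worth retrying."""
    return error_code in RETRYABLE_CODES


def retry_delay(attempt: int, retry_after: Optional[float] = None,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (0 for the first retry).

    A Retry-After value, when present, takes precedence over the computed backoff.
    """
    if retry_after is not None:
        return max(0.0, retry_after)
    return min(cap, base * (2 ** attempt))
```

In practice many clients also add random jitter to the computed delay so that concurrent workers do not retry in lockstep; that is a common refinement, not something this guide prescribes.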
## next-steps
- Rate Limits: quota model, concurrency slots, rate limit headers
- Scraping: when 502 internal_error occurs and how fallback works
- API Reference: per-endpoint error codes
---
URL: https://rapidcrawl.dev/docs/guides/monitors
# Monitors
Detect visual changes on any URL and receive webhook notifications.
Monitors periodically capture a URL and notify you when the page content changes. Use them
for price tracking, content alerts, uptime monitoring, or competitor surveillance.
## create-a-monitor
| Field | Description |
|-------|-------------|
| url | The page to monitor |
| interval | Check frequency in seconds (e.g. 3600 = every hour) |
| notify_url | Webhook endpoint to receive change notifications |

A successful response returns the monitor object with an id prefixed mon_:
## monitor-captures
Each time a monitor fires and detects a change, it records a capture event. Retrieve the
history with:
Each capture in the list represents one detected change event, including the timestamp and a
diff of what changed.
## webhook-payload
When a change is detected, Bytekit sends a POST to your notify_url with a JSON body:
| Field | Description |
|-------|-------------|
| has_change | true when the captured content differs from the previous capture |
| change_pct | Percentage of content that changed (0–100) |
| scrape | Full scrape result envelope (same shape as POST /v1/scrape response) |

Your endpoint must respond with 2xx within 10 seconds. Failed deliveries are retried with
exponential backoff.
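A receiver might summarize the payload along these lines. This is a sketch that relies only on the has_change and change_pct fields listed above; the `threshold_pct` parameter and the function name are hypothetical, added to show how small changes could be filtered on the client side.

```python
from typing import Any, Mapping


def describe_change(payload: Mapping[str, Any], threshold_pct: float = 0.0) -> str:
    """Turn a monitor webhook payload into a short human-readable summary."""
    if not payload.get("has_change"):
        return "no change"
    pct = float(payload.get("change_pct", 0))
    if pct < threshold_pct:
        # Hypothetical client-side filter, not an API feature.
        return f"minor change ({pct:.1f}%)"
    return f"changed ({pct:.1f}%)"
```

Keep the handler fast: do the heavy work after acknowledging the delivery, since the endpoint has to answer with a 2xx within 10 seconds.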
## next-steps
- API Reference, Monitors: full endpoint schema
- Errors: error codes returned by monitor endpoints
---
URL: https://rapidcrawl.dev/docs/guides/rate-limits
# Rate Limits
How Bytekit enforces quotas, concurrency slots, and rate limits.
Bytekit enforces three independent limits: rate limits (requests per second),
concurrency slots (simultaneous in-flight requests), and quota (monthly usage caps).
All three are enforced before any capture work begins.
## quota-model
Quota is pre-incremented on every request and corrected after the request completes:
- Pre-increment: your usage counter is incremented by an estimate before the capture starts. This prevents over-committing capacity even for concurrent requests.
- Adjust: once the response returns, the actual content_length (compressed wire bytes) replaces the estimate for bandwidth accounting.
- Compensating decrement: if the request fails with a terminal error, the pre-incremented amount is rolled back so you are not charged for failed requests.

This means your dashboard usage counter may briefly read higher than actual consumption during
a burst of requests, then settle to the true value.
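The accounting pattern can be illustrated with a toy counter. This is a conceptual model of pre-increment, adjust, and compensating decrement, not Bytekit's server code; the class and method names are made up for the example.

```python
class QuotaCounter:
    """Toy model of the three-step quota accounting described above."""

    def __init__(self) -> None:
        self.used = 0

    def reserve(self, estimate: int) -> int:
        """Pre-increment: charge an estimate before the capture starts."""
        self.used += estimate
        return estimate

    def settle(self, estimate: int, actual: int) -> None:
        """Adjust: replace the estimate with the actual content_length."""
        self.used += actual - estimate

    def rollback(self, estimate: int) -> None:
        """Compensating decrement: undo the charge after a terminal failure."""
        self.used -= estimate
```

For example, reserving 1000 bytes and settling at 640 leaves `used` at 640; a second reservation of 1000 that is rolled back leaves it unchanged.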
## concurrency-slots
Each account has a concurrency limit — the maximum number of in-flight requests at one time.
Slots are released automatically a few minutes after a client disconnects, so a crashed client
cannot permanently hold them.
When you exceed your concurrency limit, you receive 429 rate_limited with a Retry-After
header indicating how long to wait before retrying.
## rate-limit-headers
Every response includes headers that show your current position:
| Header | Description |
|--------|-------------|
| X-RateLimit-Limit | Your account's request-per-second limit |
| X-RateLimit-Remaining | Requests remaining in the current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
| Retry-After | Seconds to wait before retrying (present on 429 only) |
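One way a client could turn these headers into a wait time, sketched in Python. The function name and the fallback behaviour when headers are missing are assumptions; header names and meanings are taken from the table above.

```python
import time
from typing import Mapping, Optional


def wait_seconds(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Seconds to pause before the next request, based on rate-limit headers."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    h = {name.lower(): value for name, value in headers.items()}
    if "retry-after" in h:
        return float(h["retry-after"])
    remaining = int(h.get("x-ratelimit-remaining", "1"))
    if remaining > 0 or "x-ratelimit-reset" not in h:
        return 0.0
    current = time.time() if now is None else now
    # X-RateLimit-Reset is a Unix timestamp; never return a negative wait.
    return max(0.0, float(h["x-ratelimit-reset"]) - current)
```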
## bandwidth-metering
Bandwidth usage is measured as content_length — the compressed wire bytes returned by the
origin server. If the origin sends gzip or Brotli-encoded content, the compressed size is what
counts, not the decompressed size. This matches what your client actually receives over the
network.
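The difference between wire size and decoded size is easy to see locally. The snippet below only demonstrates gzip compression with the Python standard library; it does not reproduce how Bytekit measures content_length, and the helper name is illustrative.

```python
import gzip
from typing import Tuple


def wire_and_decoded_sizes(html: bytes) -> Tuple[int, int]:
    """Return (gzip-compressed size, decompressed size) in bytes."""
    compressed = gzip.compress(html)
    return len(compressed), len(html)
```

For repetitive markup the compressed figure is usually a small fraction of the decoded one, which is why metering on wire bytes tends to favour the caller.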
## handling-429-responses
When you receive 429 rate_limited:
- Read the Retry-After header value (in seconds).
- Wait at least that long before retrying.
- If no Retry-After is present, use exponential backoff starting at 1 second.
## next-steps
- Errors: full error code reference and retry guidance
- Scraping: how quota interacts with the fast and slow paths
---
URL: https://rapidcrawl.dev/docs/guides/scraping
# Scraping
How Bytekit's fast-path and slow-path scrape pipelines work, and when each is used.
POST /v1/scrape routes each request through one of two execution paths depending on the
options you supply. Understanding which path a request takes helps you predict latency and
tune your integration.
## fast-path-500ms
Simple requests go through the fast path: Bytekit fetches the page directly over a proxy
and returns the content, compressed. This is the default for any request that doesn't require a
browser.
Fast-path responses include X-Scrape-* headers that describe how the request was handled.
A fast-path request looks like this:
## slow-path-315s
Any of the following options forces the request onto the slow path, which renders the page
in a full browser:
| Option | Trigger condition |
|--------|-------------------|
| wait_for_selector | Any non-empty value |
| wait_until | "networkidle" |
| cookies | Non-empty array |
| delay_ms | Greater than 0 |
| headers | Non-empty object |
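The trigger conditions can be expressed as a small predicate, which is handy for predicting latency before sending a request. This is a client-side sketch of the table above, not the server's routing code; the function name is hypothetical.

```python
from typing import Any, Mapping


def requires_browser(options: Mapping[str, Any]) -> bool:
    """True if any slow-path trigger is present in the scrape options."""
    return bool(
        options.get("wait_for_selector")                 # any non-empty value
        or options.get("wait_until") == "networkidle"    # only this value triggers
        or options.get("cookies")                        # non-empty array
        or (options.get("delay_ms") or 0) > 0            # greater than 0
        or options.get("headers")                        # non-empty object
    )
```

Note that `country` is deliberately absent from the checks, so a request that sets only `country` stays on the fast path.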
## country-does-not-trigger-the-slow-path
The country option routes the request through a geo-located proxy but does not force a
browser render. It rides the fast path. Only when country is combined with one of the
slow-path triggers listed above does the request render in a browser.
## automatic-browser-retry
When a direct fetch is blocked by bot protection — HTTP 403 or 429 where the site is actively
refusing scrapers — Bytekit automatically retries the request in a full browser that can
clear common challenges. This retry is transparent: the API response shape is identical, and
you don't write any retry logic.
Other failure modes (network errors, timeouts, non-403/429 errors) return 502 internal_error
directly without attempting the browser retry.
## response-fields
| Field | Description |
|-------|-------------|
| data.content | Page content in the requested format (markdown by default) |
| data.content_length | Compressed wire bytes, used for bandwidth billing |
| data.metadata.statusCode | HTTP status code of the origin page |
| data.metadata.title | Page `<title>` element |

The X-Scrape-* response headers describe how each request was handled (which path served it,
cache status, timing, and the final URL) — useful for debugging.
## async-scrapes
For long-running pages, pass "async": true to get a 202 Accepted response with a scrape
id prefixed sc_. Poll the result with GET /v1/scrape/{id}.
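A polling loop might look like the sketch below. The HTTP call is injected as `get_scrape` so the loop stays independent of any client library. The status values `"completed"` and `"failed"` are assumptions inferred from the webhook event names; the actual values of the `status` discriminator should be checked against the API reference.

```python
import time
from typing import Any, Callable, Dict

# Assumption: terminal values of the status discriminator.
TERMINAL_STATUSES = {"completed", "failed"}


def poll_scrape(get_scrape: Callable[[str], Dict[str, Any]], scrape_id: str,
                interval: float = 2.0, timeout: float = 60.0,
                sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Call get_scrape(scrape_id) until it reports a terminal status or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = get_scrape(scrape_id)
        if result.get("status") in TERMINAL_STATUSES:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"scrape {scrape_id} did not finish within {timeout}s")
        sleep(interval)
```

Here `get_scrape` would wrap GET /v1/scrape/{id} and return the decoded JSON body.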
## webhook-event-header
Every outbound webhook POST includes an X-RapidCrawl-Event header with the event type:
| Event type | Trigger |
|------------|---------|
| bulk.completed | A bulk job finishes (all items processed) |
| scrape.completed | An async scrape succeeds |
| scrape.failed | An async scrape fails terminally |
| monitor.change_detected | A monitor fires and detects a visual change |
| monitor.captured | A monitor fires with no visual change (notify_on: every) |
| sitemap.completed | A sitemap job finishes |
| sitemap.failed | A sitemap job fails terminally |

Use this header to dispatch events without inspecting the body shape:
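A minimal dispatcher sketch in Python. The event names come from the table above; the handler table, return strings, and function name are placeholders for whatever a receiver actually does with each event.

```python
from typing import Any, Callable, Dict, Mapping

# Placeholder handlers; a real receiver would store results, send alerts, etc.
HANDLERS: Dict[str, Callable[[Dict[str, Any]], str]] = {
    "scrape.completed": lambda body: "scrape ok",
    "scrape.failed": lambda body: "scrape failed",
    "monitor.change_detected": lambda body: "page changed",
}


def dispatch_webhook(headers: Mapping[str, str], body: Dict[str, Any]) -> str:
    """Route a webhook delivery by its X-RapidCrawl-Event header."""
    event = None
    for name, value in headers.items():
        if name.lower() == "x-rapidcrawl-event":  # header names are case-insensitive
            event = value
            break
    handler = HANDLERS.get(event)
    if handler is None:
        return f"unhandled event: {event}"
    return handler(body)
```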
X-RapidCrawl-Event is a reserved header — you cannot supply it via webhook_headers in
monitor or bulk configuration. Attempts to do so will be rejected with a 422 at request
time.
## next-steps
- Rate Limits: quota model and concurrency slots
- Errors: how to interpret and retry error responses
- API Reference, Scrape: full request/response schema
---
URL: https://rapidcrawl.dev/docs/libraries/go
# Go SDK
An official Go client for Bytekit is on the way.
## coming-soon
An official Go SDK is planned. It will offer the same capabilities as the
TypeScript and Python SDKs — scrape,
screenshot, record, bulk, monitors, sitemap, and account management — with idiomatic Go types
and context.Context support.
## in-the-meantime
Bytekit is a plain REST API, so you can call it from Go today with the standard library.
## next-steps
- API Reference: every endpoint, request, and response.
- TypeScript SDK and Python SDK: available now.
---
URL: https://rapidcrawl.dev/docs/libraries/python
# Python SDK
Official Python client for Bytekit, with sync and async support.
The official Python SDK is generated from the Bytekit OpenAPI spec, so its models and
methods always match the API. Every operation ships in both synchronous and asynchronous form.
## your-first-scrape
Each operation exposes four call styles:
- create_scrape.sync(...): returns the parsed model (or None).
- create_scrape.sync_detailed(...): returns the full Response (status code, headers, body).
- create_scrape.asyncio(...): awaitable form of sync.
- create_scrape.asyncio_detailed(...): awaitable form of sync_detailed.
## whats-available
Operations are grouped by capability under rapidcrawl_python.api.*: scrape, scrape_bulk,
screenshots, recordings, bulk, fetch, fetch_bulk, monitors, sitemap, and
account.
See the Python SDK Reference for every module, function, and model.
## next-steps
- TypeScript SDK: the same API surface in TypeScript.
- API Reference: the underlying REST endpoints.
- Errors: error codes and retry guidance.
---
URL: https://rapidcrawl.dev/docs/libraries/typescript
# TypeScript SDK
Type-safe Bytekit client for Node.js and modern JavaScript runtimes.
The official TypeScript SDK is a thin, type-safe wrapper over the Bytekit REST API. It works
in Node.js and any runtime with a global fetch (Bun, Deno, Cloudflare Workers, modern
browsers).
## initialize
By default the client targets production (https://api.bytekit.com). Point it at staging by
passing baseUrl:
## handle-errors
Any 4xx/5xx response throws a RapidCrawlError carrying the HTTP status and the API
error.code:
## async-scrapes
Pass async: true to enqueue the job and poll by ID:
## whats-available
The client exposes one resource per capability: scrape, screenshots, recordings, bulk,
fetch, monitors, sitemap, and account (each with create / get / list methods as
applicable, plus nested resources like scrape.bulk and monitors.captures).
See the TypeScript SDK Reference for every method and option.
## next-steps
- Python SDK: the same API surface in Python.
- API Reference: the underlying REST endpoints.
- Errors: error codes and retry guidance.
---
URL: https://rapidcrawl.dev/docs/api/account/createapikey
# Create API key
Returns the plaintext key exactly once. Max 50 active keys per account.
---
URL: https://rapidcrawl.dev/docs/api/account/getaccount
# Get account details
Returns account, plan, and current API key details. Accepts either auth scheme:
- Bearer API key (`bearerAuth`): returns full account shape including `api_key` and `subscription` fields.
- Clerk session JWT (`ClerkSession`): returns the minimal session account shape (id, name, email, plan, owner_user_id).
---
URL: https://rapidcrawl.dev/docs/api/account/getapikey
# Get API key
---
URL: https://rapidcrawl.dev/docs/api/account/listapikeys
# List API keys
---
URL: https://rapidcrawl.dev/docs/api/account/revealapikey
# Reveal API key
Returns the plaintext API key value (the full `sk_live_...` / `sk_test_...` token).
Display it once; it will not be returned again from this endpoint without an explicit
re-reveal call.
Any active member of the owning account may reveal a key.
## Error codes
| HTTP | error_code | Meaning |
|------|---------------------|---------|
| 401 | unauthenticated | Missing or invalid Clerk JWT. |
| 403 | forbidden | Caller is not a member of the owning account. |
| 404 | not_found | No API key matches the given ID. |
| 409 | api_key_revoked | The API key has been revoked and cannot be revealed. |
| 409 | reveal_unavailable | This key was created before the reveal feature was enabled. Revoke and recreate to enable reveal. |
| 429 | rate_limited | Per-account reveal rate limit exceeded. |
| 500 | internal_error | |
---
URL: https://rapidcrawl.dev/docs/api/account/revokeapikey
# Revoke API key
Idempotent — returns 204 even if already revoked.
---
URL: https://rapidcrawl.dev/docs/api/account/updateapikey
# Rename API key
---
URL: https://rapidcrawl.dev/docs/api/bulk/createbulk
# Create mixed bulk job
Enqueues screenshots, recordings, or scrapes in a single batch.
Provide `urls` (simple list) or `items` (per-item config), not both.
Webhook deliveries include `X-RapidCrawl-Event: bulk.completed` so
receivers can dispatch on the header without parsing the body shape.
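The either/or rule for `urls` and `items` can be checked before the request is sent. This sketch treats an empty or missing field as absent, and it also rejects bodies with neither field, which is an assumption since the description only rules out supplying both. The function name is hypothetical.

```python
from typing import Any, Mapping


def validate_bulk_body(body: Mapping[str, Any]) -> None:
    """Raise ValueError unless exactly one of urls / items is provided."""
    has_urls = bool(body.get("urls"))
    has_items = bool(body.get("items"))
    if has_urls and has_items:
        raise ValueError("provide urls or items, not both")
    if not has_urls and not has_items:
        # Assumption: an empty job is not meaningful.
        raise ValueError("provide either urls or items")
```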
---
URL: https://rapidcrawl.dev/docs/api/bulk/getbulk
# Get bulk job status
---
URL: https://rapidcrawl.dev/docs/api/bulk/listbulkscreenshots
# List bulk job items
---
URL: https://rapidcrawl.dev/docs/api/fetch/getfetch
# Fetch a URL (GET)
Direct HTTP fetch. The response body IS the content; metadata is returned in X-Fetch-* headers.
Omit `format` for raw passthrough. Use `format=markdown` or `format=html` for conversion.
---
URL: https://rapidcrawl.dev/docs/api/fetch/postfetch
# Fetch a URL (POST)
Same as GET /v1/fetch but with a JSON request body.
---
URL: https://rapidcrawl.dev/docs/api/fetch-bulk/createfetchbulk
# Bulk fetch URLs
Enqueues multiple URLs for fast-path HTTP fetching. Results delivered via webhook.
Does not support browser-backed options (cookies, wait_for_selector, etc.).
---
URL: https://rapidcrawl.dev/docs/api/fetch-bulk/getfetchbulk
# Get fetch bulk job status
---
URL: https://rapidcrawl.dev/docs/api/recordings/createrecording
# Create a scrolling recording
Always async — returns 202. Poll via GET /v1/recordings/{id}.
---
URL: https://rapidcrawl.dev/docs/api/recordings/getrecording
# Get recording status
---
URL: https://rapidcrawl.dev/docs/api/health/healthcheck
# Health check
---
URL: https://rapidcrawl.dev/docs/api/monitors/createmonitor
# Create a change-detection monitor
Creates a recurring change-detection monitor for a URL.
Webhook deliveries include `X-RapidCrawl-Event: monitor.change_detected` or
`X-RapidCrawl-Event: monitor.captured` so receivers can dispatch on the
header without parsing the body shape.
Scrape-monitor webhook payloads carry an additive `was_preflight: boolean`
field. It is `true` only when the cycle confirmed the page was unchanged
(the target returned HTTP 304) — emitted as a heartbeat under
`notify_on='every'`. On the `notify_on='change'` delivery path, unchanged
cycles suppress the webhook entirely. The field is `false` on every other
delivery, including all captures that carry a body.
---
URL: https://rapidcrawl.dev/docs/api/monitors/deletemonitor
# Cancel monitor
---
URL: https://rapidcrawl.dev/docs/api/monitors/getmonitor
# Get monitor
---
URL: https://rapidcrawl.dev/docs/api/monitors/listmonitorcaptures
# List monitor captures
---
URL: https://rapidcrawl.dev/docs/api/monitors/listmonitors
# List monitors
---
URL: https://rapidcrawl.dev/docs/api/monitors/updatemonitor
# Update monitor
---
URL: https://rapidcrawl.dev/docs/api/scrape/createscrape
# Scrape a URL
Fetches and processes a URL, returning content in one or more formats
wrapped in a ScrapeEnvelope. Simple requests use the HTTP fast-path (~500ms);
complex requests (delay_ms > 0, cookies, custom headers, wait conditions)
are routed to headless Chromium and return HTTP 202 (3-15s).
When `webhook_url` is provided and the scrape is async, webhook deliveries
include `X-RapidCrawl-Event: scrape.completed` or `X-RapidCrawl-Event: scrape.failed`
so receivers can dispatch on the header without parsing the body shape.
---
URL: https://rapidcrawl.dev/docs/api/scrape/getscrape
# Get scrape result
Always returns 200. Check the `status` discriminator for the scrape state.
---
URL: https://rapidcrawl.dev/docs/api/screenshots/createscreenshot
# Capture a screenshot
Sync by default (holds up to 28s). Pass `?async=true` to return 202 immediately.
Poll via GET /v1/screenshots/{id}.
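The two modes differ only in the query string and in how long the caller should be prepared to wait. A small sketch of the URL construction follows; the base URL is the default listed for the TypeScript client and is assumed to apply here, and the five-second margin on the timeout is an arbitrary choice for illustration.

```python
from urllib.parse import urlencode

# Assumption: same default base URL as the TypeScript client's baseUrl option.
BASE_URL = "https://api.bytekit.com"

# Sync requests can be held for up to 28s, so a client-side timeout
# should sit above that. The 5s margin is an arbitrary illustration.
SYNC_CLIENT_TIMEOUT_S = 28 + 5


def create_screenshot_url(run_async: bool = False) -> str:
    """URL to POST a screenshot request to, in sync or async mode."""
    url = f"{BASE_URL}/v1/screenshots"
    if run_async:
        url += "?" + urlencode({"async": "true"})
    return url


def poll_screenshot_url(screenshot_id: str) -> str:
    """URL to GET when polling a screenshot by ID."""
    return f"{BASE_URL}/v1/screenshots/{screenshot_id}"
```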
---
URL: https://rapidcrawl.dev/docs/api/screenshots/getscreenshot
# Get screenshot status
---
URL: https://rapidcrawl.dev/docs/api/sitemap/createsitemap
# Discover URLs on a domain
Crawls a domain to build a sitemap. Returns cached results if available
within cache_ttl. Results delivered via webhook and downloadable from results_url.
Webhook deliveries include `X-RapidCrawl-Event: sitemap.completed` or
`X-RapidCrawl-Event: sitemap.failed` so receivers can dispatch on the
header without parsing the body shape.
---
URL: https://rapidcrawl.dev/docs/api/sitemap/getsitemap
# Get sitemap job status
---
URL: https://rapidcrawl.dev/docs/api/scrape-bulk/createscrapebulk
# Bulk scrape URLs
Enqueues multiple URLs for scraping. Results delivered via webhook.
Webhook deliveries include `X-RapidCrawl-Event: scrape.completed` or
`X-RapidCrawl-Event: scrape.failed` so receivers can dispatch on the
header without parsing the body shape.
---
URL: https://rapidcrawl.dev/docs/api/scrape-bulk/getscrapebulk
# Get scrape bulk job status
---
URL: https://rapidcrawl.dev/docs/sdk/python/account
# Account
API functions for the Account resource.
## account
Functions for interacting with the account API resource. Each operation ships in four forms — sync / sync_detailed and their async counterparts.
## create-api-key
Create API key
## create-api-key
Returns the plaintext key exactly once. Max 50 active keys per account.
## create-api-key
Parameters
## create-api-key
body (CreateApiKeyBody | Unset)
## create-api-key
Call styles
## get-account
Get account details
## get-account
Returns account, plan, and current API key details. Accepts either auth scheme:
## get-account
Bearer API key (bearerAuth): returns full account shape including api_key and subscription
fields.
## get-account
Clerk session JWT (ClerkSession): returns the minimal session account shape (id, name, email,
plan, owner_user_id).
## get-account
Call styles
## get-api-key
Get API key
## get-api-key
Parameters
## get-api-key
id (str)
## get-api-key
Call styles
## list-api-keys
List API keys
## list-api-keys
Parameters
## list-api-keys
include_revoked (bool | Unset): Default: False.
## list-api-keys
Call styles
## reveal-api-key
Reveal API key
## reveal-api-key
Decrypts and returns the plaintext API key. Only available when reveal_available is true.
## reveal-api-key
Parameters
## reveal-api-key
id (str)
## reveal-api-key
Call styles
## revoke-api-key
Revoke API key
## revoke-api-key
Idempotent — returns 204 even if already revoked.
## revoke-api-key
Parameters
## revoke-api-key
id (str)
## revoke-api-key
Call styles
## update-api-key
Rename API key
## update-api-key
Parameters
## update-api-key
id (str)
## update-api-key
body (UpdateApiKeyBody)
## update-api-key
Call styles
---
URL: https://rapidcrawl.dev/docs/sdk/python/bulk
# Bulk
API functions for the Bulk resource.
## bulk
Functions for interacting with the bulk API resource. Each operation ships in four forms — sync / sync_detailed and their async counterparts.
## create-bulk
Create mixed bulk job
## create-bulk
Enqueues screenshots, recordings, or scrapes in a single batch.
Provide urls (simple list) or items (per-item config), not both.
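The urls/items rule is easy to enforce before a request is sent. This is a minimal sketch using plain dictionaries rather than the SDK's CreateBulkBody model; the helper name and the per-item shape are hypothetical, and rejecting a body with neither field is an assumption, since the text above only rules out sending both.

```python
from typing import Optional, Sequence


def build_bulk_body(
    urls: Optional[Sequence[str]] = None,
    items: Optional[Sequence[dict]] = None,
) -> dict:
    """Build a bulk request body containing exactly one of urls or items."""
    # Both set or both missing: refuse locally before any request is sent.
    if (urls is None) == (items is None):
        raise ValueError("provide exactly one of 'urls' or 'items'")
    if urls is not None:
        return {"urls": list(urls)}
    return {"items": list(items)}
```

A call such as `build_bulk_body(urls=["https://example.com"])` produces the simple-list form, while passing both arguments raises before anything reaches the API.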
## create-bulk
Parameters
## create-bulk
body (CreateBulkBody)
## create-bulk
Call styles
## get-bulk
Get bulk job status
## get-bulk
Parameters
## get-bulk
id (str)
## get-bulk
Call styles
## list-bulk-screenshots
List bulk job items
## list-bulk-screenshots
Parameters
## list-bulk-screenshots
id (str)
## list-bulk-screenshots
Call styles
---
URL: https://rapidcrawl.dev/docs/sdk/python/client
# Client
HTTP client classes for the Bytekit Python SDK.
## client-reference
The Bytekit Python SDK provides two client classes for making API requests.
## client
A class for keeping track of data related to the API
## client
The following are accepted as keyword arguments and will be used to construct httpx Clients internally:
## client
base_url: The base URL for the API; all requests are made to paths relative to this URL
## client
cookies: A dictionary of cookies to be sent with every request
## client
headers: A dictionary of headers to be sent with every request
## client
timeout: The maximum amount of time a request can take. API functions will raise
httpx.TimeoutException if this is exceeded.
## client
verify_ssl: Whether or not to verify the SSL certificate of the API server. This should be True in production,
but can be set to False for testing purposes.
## client
follow_redirects: Whether or not to follow redirects. Default value is False.
## client
httpx_args: A dictionary of additional arguments to be passed to the httpx.Client and httpx.AsyncClient constructor.
## client
Attributes:
raise_on_unexpected_status: Whether or not to raise an errors.UnexpectedStatus if the API returns a
status code that was not documented in the source OpenAPI document. Can also be provided as a keyword
argument to the constructor.
## with_headers
Get a new client matching this one with additional headers
## with_cookies
Get a new client matching this one with additional cookies
## with_timeout
Get a new client matching this one with a new timeout configuration
## set_httpx_client
Manually set the underlying httpx.Client
## set_httpx_client
NOTE: This will override any other settings on the client, including cookies, headers, and timeout.
## get_httpx_client
Get the underlying httpx.Client, constructing a new one if not previously set
## set_async_httpx_client
Manually set the underlying httpx.AsyncClient
## set_async_httpx_client
NOTE: This will override any other settings on the client, including cookies, headers, and timeout.
## get_async_httpx_client
Get the underlying httpx.AsyncClient, constructing a new one if not previously set
## authenticatedclient
A Client which has been authenticated for use on secured endpoints
## authenticatedclient
The following are accepted as keyword arguments and will be used to construct httpx Clients internally:
## authenticatedclient
base_url: The base URL for the API; all requests are made to paths relative to this URL
## authenticatedclient
cookies: A dictionary of cookies to be sent with every request
## authenticatedclient
headers: A dictionary of headers to be sent with every request
## authenticatedclient
timeout: The maximum amount of time a request can take. API functions will raise
httpx.TimeoutException if this is exceeded.
## authenticatedclient
verify_ssl: Whether or not to verify the SSL certificate of the API server. This should be True in production,
but can be set to False for testing purposes.
## authenticatedclient
follow_redirects: Whether or not to follow redirects. Default value is False.
## authenticatedclient
httpx_args: A dictionary of additional arguments to be passed to the httpx.Client and httpx.AsyncClient constructor.
## authenticatedclient
Attributes:
raise_on_unexpected_status: Whether or not to raise an errors.UnexpectedStatus if the API returns a
status code that was not documented in the source OpenAPI document. Can also be provided as a keyword
argument to the constructor.
token: The token to use for authentication
prefix: The prefix to use for the Authorization header
auth_header_name: The name of the Authorization header
## with_headers-1
Get a new client matching this one with additional headers
## with_cookies-1
Get a new client matching this one with additional cookies
## with_timeout-1
Get a new client matching this one with a new timeout configuration
## set_httpx_client-1
Manually set the underlying httpx.Client
## set_httpx_client-1
NOTE: This will override any other settings on the client, including cookies, headers, and timeout.
## get_httpx_client-1
Get the underlying httpx.Client, constructing a new one if not previously set
## set_async_httpx_client-1
Manually set the underlying httpx.AsyncClient
## set_async_httpx_client-1
NOTE: This will override any other settings on the client, including cookies, headers, and timeout.
## get_async_httpx_client-1
Get the underlying httpx.AsyncClient, constructing a new one if not previously set
---
URL: https://rapidcrawl.dev/docs/sdk/python/fetch-bulk
# Fetch Bulk
API functions for the Fetch Bulk resource.
## fetch-bulk
Functions for interacting with the fetch_bulk API resource. Each operation ships in four forms — sync / sync_detailed and their async counterparts.
## create-fetch-bulk
Bulk fetch URLs
## create-fetch-bulk
Enqueues multiple URLs for fast-path HTTP fetching. Results delivered via webhook.
Does not support browser-backed options (cookies, wait_for_selector, etc.).
## create-fetch-bulk
Parameters
## create-fetch-bulk
body (CreateFetchBulkBody)
## create-fetch-bulk
Call styles
## get-fetch-bulk
Get fetch bulk job status
## get-fetch-bulk
Parameters
## get-fetch-bulk
id (str)
## get-fetch-bulk
Call styles
---
URL: https://rapidcrawl.dev/docs/sdk/python/fetch
# Fetch
API functions for the Fetch resource.
## fetch
Functions for interacting with the fetch API resource. Each operation ships in four forms — sync / sync_detailed and their async counterparts.
## get-fetch
Fetch a URL (GET)
## get-fetch
Fast-path HTTP fetch. Response body IS the content; metadata in X-Fetch-* headers.
Omit format for raw passthrough. Use format=markdown or format=html for conversion.
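Because the body is the content itself, any metadata has to be read from the response headers. The sketch below collects every header with the `X-Fetch-` prefix into a dictionary; the specific header names used in the usage note are invented for illustration, as the individual `X-Fetch-*` headers are not listed here.

```python
from typing import Mapping

FETCH_PREFIX = "x-fetch-"


def fetch_metadata(headers: Mapping[str, str]) -> dict[str, str]:
    """Return X-Fetch-* headers keyed by their lower-cased suffix."""
    metadata = {}
    for name, value in headers.items():
        lowered = name.lower()  # header names are case-insensitive
        if lowered.startswith(FETCH_PREFIX):
            metadata[lowered[len(FETCH_PREFIX):]] = value
    return metadata
```

Given hypothetical headers `{"Content-Type": "text/markdown", "X-Fetch-Example": "1"}`, the function returns `{"example": "1"}` and leaves the content type out.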
## get-fetch
Parameters
## get-fetch
url_query (str)
## get-fetch
format_ (GetFetchFormat | Unset)
## get-fetch
country (str | Unset): Default: 'US'.
## get-fetch
timeout_ms (int | Unset): Default: 60000.
## get-fetch
cache_ttl (GetFetchCacheTtlType1 | str | Unset): Default: '48h'.
## get-fetch
Call styles
## post-fetch
Fetch a URL (POST)
## post-fetch
Same as GET /v1/fetch but with a JSON request body.
## post-fetch
Parameters
## post-fetch
body (FetchRequest)
## post-fetch
Call styles
---
URL: https://rapidcrawl.dev/docs/sdk/python/monitors
# Monitors
API functions for the Monitors resource.
## monitors
Functions for interacting with the monitors API resource. Each operation ships in four forms — sync / sync_detailed and their async counterparts.
## create-monitor
Create a change-detection monitor
## create-monitor
Parameters
## create-monitor
body (MonitorCreateRequest)
## create-monitor
Call styles
## delete-monitor
Cancel monitor
## delete-monitor
Parameters
## delete-monitor
id (str)
## delete-monitor
Call styles
## get-monitor
Get monitor
## get-monitor
Parameters
## get-monitor
id (str)
## get-monitor
Call styles
## list-monitor-captures
List monitor captures
## list-monitor-captures
Parameters
## list-monitor-captures
id (str)
## list-monitor-captures
limit (int | Unset): Default: 25.
## list-monitor-captures
cursor (str | Unset)
## list-monitor-captures
Call styles
## list-monitors
List monitors
## list-monitors
Parameters
## list-monitors
limit (int | Unset): Default: 25.
## list-monitors
cursor (str | Unset)
## list-monitors
status (ListMonitorsStatus | Unset): Default: ListMonitorsStatus.ACTIVE.
## list-monitors
Call styles
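The limit and cursor parameters suggest cursor-based pagination. The sketch below walks all pages through an injected page function; the response field names `items` and `next_cursor` are assumptions for illustration and should be checked against the actual list response, and `list_page` is a placeholder for whichever call style is used.

```python
from typing import Callable, Iterator, Optional


def iter_all(
    list_page: Callable[[int, Optional[str]], dict],
    limit: int = 25,
) -> Iterator[dict]:
    """Yield every record by following cursors until none is returned."""
    cursor: Optional[str] = None
    while True:
        page = list_page(limit, cursor)
        # Assumed response shape: {"items": [...], "next_cursor": str | None}
        yield from page.get("items", [])
        cursor = page.get("next_cursor")
        if not cursor:
            return
```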
## update-monitor
Update monitor
## update-monitor
Parameters
## update-monitor
id (str)
## update-monitor
body (MonitorUpdateRequest)
## update-monitor
Call styles
---
URL: https://rapidcrawl.dev/docs/sdk/python/recordings
# Recordings
API functions for the Recordings resource.
## recordings
Functions for interacting with the recordings API resource. Each operation ships in four forms — sync / sync_detailed and their async counterparts.
## create-recording
Create a scrolling recording
## create-recording
Always async — returns 202. Poll via GET /v1/recordings/{id}.
## create-recording
Parameters
## create-recording
body (RecordingRequest)
## create-recording
Call styles
## get-recording
Get recording status
## get-recording
Parameters
## get-recording
id (str)
## get-recording
Call styles
---
URL: https://rapidcrawl.dev/docs/sdk/python/scrape-bulk
# Scrape Bulk
API functions for the Scrape Bulk resource.
## scrape-bulk
Functions for interacting with the scrape_bulk API resource. Each operation ships in four forms — sync / sync_detailed and their async counterparts.
## create-scrape-bulk
Bulk scrape URLs
## create-scrape-bulk
Enqueues multiple URLs for scraping. Results delivered via webhook.
## create-scrape-bulk
Parameters
## create-scrape-bulk
body (CreateScrapeBulkBody)
## create-scrape-bulk
Call styles
## get-scrape-bulk
Get scrape bulk job status
## get-scrape-bulk
Parameters
## get-scrape-bulk
id (str)
## get-scrape-bulk
Call styles
---
URL: https://rapidcrawl.dev/docs/sdk/python/scrape
# Scrape
API functions for the Scrape resource.
## scrape
Functions for interacting with the scrape API resource. Each operation ships in four forms — sync / sync_detailed and their async counterparts.
## create-scrape
Scrape a URL
## create-scrape
Fetches and processes a URL, returning content in one or more formats
wrapped in a ScrapeEnvelope. Simple requests use the HTTP fast-path (~500ms);
complex requests (cookies, wait conditions) use headless Chromium (3-15s).
## create-scrape
Parameters
## create-scrape
body (ScrapeRequest)
## create-scrape
Call styles
## get-scrape
Get scrape result
## get-scrape
Always returns 200. Check the status discriminator for the scrape state.
## get-scrape
Parameters
## get-scrape
id (str)
## get-scrape
Call styles
---
URL: https://rapidcrawl.dev/docs/sdk/python/screenshots
# Screenshots
API functions for the Screenshots resource.
## screenshots
Functions for interacting with the screenshots API resource. Each operation ships in four forms — sync / sync_detailed and their async counterparts.
## create-screenshot
Capture a screenshot
## create-screenshot
Sync by default (holds up to 28s). Pass ?async=true to return 202 immediately.
Poll via GET /v1/screenshots/{id}.
## create-screenshot
Parameters
## create-screenshot
async_ (bool | Unset): Default: False.
## create-screenshot
body (ScreenshotRequest)
## create-screenshot
Call styles
## get-screenshot
Get screenshot status
## get-screenshot
Parameters
## get-screenshot
id (str)
## get-screenshot
Call styles
---
URL: https://rapidcrawl.dev/docs/sdk/python/sitemap
# Sitemap
API functions for the Sitemap resource.
## sitemap
Functions for interacting with the sitemap API resource. Each operation ships in four forms — sync / sync_detailed and their async counterparts.
## create-sitemap
Discover URLs on a domain
## create-sitemap
Crawls a domain to build a sitemap. Returns cached results if available
within cache_ttl. Results delivered via webhook and downloadable from results_url.
## create-sitemap
Parameters
## create-sitemap
body (SitemapRequest)
## create-sitemap
Call styles
## get-sitemap
Get sitemap job status
## get-sitemap
Parameters
## get-sitemap
id (str)
## get-sitemap
Call styles
---
URL: https://rapidcrawl.dev/docs/sdk/typescript/account
# Account
Account info and API key management.
## account
Account info and API key management. Accessed via client.account.
## get
GET /v1/account — get the authenticated account
## create
POST /v1/account/api-keys — create a new API key
## list
GET /v1/account/api-keys — list all API keys for the account
## get-1
GET /v1/account/api-keys/{id} — get an API key by ID
## update
PATCH /v1/account/api-keys/{id} — update an API key
## revoke
DELETE /v1/account/api-keys/{id} — revoke an API key (idempotent)
## reveal
POST /v1/account/api-keys/{id}/reveal — reveal the plaintext API key value
---
URL: https://rapidcrawl.dev/docs/sdk/typescript/bulk
# Bulk
Fan out many URLs in parallel with per-item webhooks.
## bulk
Fan out many URLs in parallel with per-item webhooks. Accessed via client.bulk.
## create
POST /v1/bulk — create a bulk screenshots job
## get
GET /v1/bulk/{id} — get a bulk job status
## list
GET /v1/bulk/{id}/screenshots — list screenshots in a bulk job
---
URL: https://rapidcrawl.dev/docs/sdk/typescript/client
# Client
Instantiate and configure the Bytekit TypeScript client.
## client
The Bytekit TypeScript SDK is a thin, type-safe wrapper over the REST API. Create a
client with your API key, then call resource methods.
## options-rapidcrawloptions
apiKey (required) — string
## options-rapidcrawloptions
baseUrl — string Defaults to https://api.bytekit.com.
## resources
The client exposes one property per resource:
## resources
client.scrape — Fetch a URL as raw HTML, clean markdown, or structured content.
## resources
client.screenshots — Capture full-page or viewport screenshots.
## resources
client.recordings — Generate a scrolling video of a page.
## resources
client.bulk — Fan out many URLs in parallel with per-item webhooks.
## resources
client.fetch — Low-latency raw HTTP fetch with optional conversion.
## resources
client.monitors — Watch a URL on a schedule and webhook on change.
## resources
client.sitemap — Discover URLs from a domain's sitemap or by crawling.
## resources
client.account — Account info and API key management.
## errors
Any non-2xx response throws a RapidCrawlError carrying the HTTP status and the API code.
---
URL: https://rapidcrawl.dev/docs/sdk/typescript/fetch
# Fetch
Low-latency raw HTTP fetch with optional conversion.
## fetch
Low-latency raw HTTP fetch with optional conversion. Accessed via client.fetch.
## create
POST /v1/fetch — fetch URL content via POST body.
## create
Returns content directly (not ID-based).
## get
GET /v1/fetch — fetch URL content via query params.
## get
Returns content directly (not ID-based).
## create-1
POST /v1/fetch/bulk — create a bulk fetch job
## get-1
GET /v1/fetch/bulk/{id} — get a bulk fetch job
## options-fetchopts
url (required) — string
## options-fetchopts
format — string
## options-fetchopts
country — string
## options-fetchopts
timeoutMs — number
## options-fetchopts
cacheTtl — number
## options-fetchopts
custom — Record — User-supplied JSON payload, base64-encoded into the X-Fetch-Custom
response header so callers can correlate the response to caller-side
state. Capped at 4096 UTF-8 bytes after JSON serialization. Does NOT
affect cache-key inputs.
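The size cap and the header encoding can both be checked on the caller's side. The sketch below is written in Python for illustration and makes three assumptions: the cap is measured on compact JSON, the header uses standard (not URL-safe) base64, and the decoded bytes are UTF-8 JSON. The helper names are hypothetical.

```python
import base64
import json

MAX_CUSTOM_BYTES = 4096  # cap stated above, measured after JSON serialization


def serialize_custom(custom: dict) -> str:
    """Serialize the payload and enforce the 4096-byte cap locally."""
    # ensure_ascii=False so the count reflects real UTF-8 bytes, not \u escapes.
    text = json.dumps(custom, separators=(",", ":"), ensure_ascii=False)
    size = len(text.encode("utf-8"))
    if size > MAX_CUSTOM_BYTES:
        raise ValueError(f"custom payload is {size} bytes; limit is {MAX_CUSTOM_BYTES}")
    return text


def decode_custom_header(value: str) -> dict:
    """Decode an X-Fetch-Custom header value back into the original payload."""
    # Assumption: standard base64 over UTF-8 encoded JSON.
    return json.loads(base64.b64decode(value).decode("utf-8"))
```

Checking the size before sending avoids a round trip for a payload the API would not accept.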
---
URL: https://rapidcrawl.dev/docs/sdk/typescript/monitors
# Monitors
Watch a URL on a schedule and webhook on change.
## monitors
Watch a URL on a schedule and webhook on change. Accessed via client.monitors.
## create
POST /v1/monitors — create a visual-change monitor
## list
GET /v1/monitors — list all monitors for the account
## get
GET /v1/monitors/{id} — get a monitor by ID
## update
PATCH /v1/monitors/{id} — update a monitor
## delete
DELETE /v1/monitors/{id} — delete a monitor
## list-1
GET /v1/monitors/{id}/captures — list captures for a monitor
---
URL: https://rapidcrawl.dev/docs/sdk/typescript/recordings
# Recordings
Generate a scrolling video of a page.
## recordings
Generate a scrolling video of a page. Accessed via client.recordings.
## create
POST /v1/recordings — start a recording
## get
GET /v1/recordings/{id} — poll a recording by ID
---
URL: https://rapidcrawl.dev/docs/sdk/typescript/scrape
# Scrape
Fetch a URL as raw HTML, clean markdown, or structured content.
## scrape
Fetch a URL as raw HTML, clean markdown, or structured content. Accessed via client.scrape.
## create
POST /v1/scrape — create a scrape job
## get
GET /v1/scrape/{id} — poll a scrape by ID
## create-1
POST /v1/scrape/bulk — create a bulk scrape job
## get-1
GET /v1/scrape/bulk/{id} — get a bulk scrape job
## options-scrapeopts
url (required) — string
## options-scrapeopts
formats — Array<'rawHtml' | 'html' | 'markdown' | 'links' | 'images'>
## options-scrapeopts
country — string
## options-scrapeopts
cookies — Array>
## options-scrapeopts
headers — Record
## options-scrapeopts
delay_ms — number
## options-scrapeopts
timeout_ms — number
## options-scrapeopts
async — boolean
## options-scrapeopts
webhook_url — string
## options-scrapeopts
events — Array<'queued' | 'completed' | 'failed'>
## options-scrapeopts
markdownMode — MarkdownMode — Markdown processing mode. article=article extraction (default), raw=minimal cleanup, llm=compact LLM-optimised output.
## options-scrapeopts
markdownQuery — string — BM25 query string for relevance-ranked content filtering. Omit or leave empty to disable.
## options-scrapeopts
markdownLinks — MarkdownLinks — Link rendering style in the markdown output.
## options-scrapeopts
markdownCompact — boolean — Collapse excessive whitespace for a more compact output.
## options-scrapeopts
markdownFilterImages — boolean — Filter low-signal images from the markdown output.
## options-scrapeopts
markdownIncludeMedia — boolean — When true, formats.links and formats.images return ScrapeScoredLink[] / ScrapeScoredImage[] (rich objects) instead of string[], and a top-level tables array is included. Only effective when markdown is in formats.
## options-scrapeopts
markdownIncludeWarnings — boolean — When true, the response includes a top-level warnings array of ScrapeWarning objects. Only effective when markdown is in formats.
## options-scrapeopts
markdownIncludeStats — boolean — When true, the response includes a top-level stats object with ScrapeStats (chars, tokens, blocks). Only effective when markdown is in formats.
## options-scrapeopts
cache_ttl — string | 0 — How long a freshly fetched URL may be served from cache. '0'/0 disables
cache, 'Nh'/'Nd' set a TTL (capped at 168h / 7d). Default '48h'.
Honoured on the synchronous path only — the async path accepts the value
but does not currently act on it.
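The accepted `cache_ttl` forms can be normalised to hours with a few lines. This sketch, in Python for illustration, clamps values above the cap to 168 hours; whether the API clamps or rejects such values is not stated above, so the clamping is an assumption, as is the helper name.

```python
import re

MAX_TTL_HOURS = 168  # 7d cap stated above
DEFAULT_TTL = "48h"


def cache_ttl_hours(value=DEFAULT_TTL) -> int:
    """Convert a cache_ttl value to hours; 0 means caching is disabled."""
    if value in (0, "0"):
        return 0
    match = re.fullmatch(r"(\d+)([hd])", str(value))
    if match is None:
        raise ValueError(f"unsupported cache_ttl: {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    hours = amount * 24 if unit == "d" else amount
    # Assumption: over-cap values are clamped rather than rejected.
    return min(hours, MAX_TTL_HOURS)
```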
## options-scrapeopts
custom — Record — User-supplied JSON payload, echoed back on the success envelope so callers
can correlate the response to caller-side state (job IDs, batch metadata).
Capped at 4096 UTF-8 bytes after JSON serialization. Does NOT affect
cache-key inputs — two requests differing only in custom share the same
cache slot.
---
URL: https://rapidcrawl.dev/docs/sdk/typescript/screenshots
# Screenshots
Capture full-page or viewport screenshots.
## screenshots
Capture full-page or viewport screenshots. Accessed via client.screenshots.
## create
POST /v1/screenshots — capture a screenshot
## get
GET /v1/screenshots/{id} — poll a screenshot by ID
---
URL: https://rapidcrawl.dev/docs/sdk/typescript/sitemap
# Sitemap
Discover URLs from a domain's sitemap or by crawling.
## sitemap
Discover URLs from a domain's sitemap or by crawling. Accessed via client.sitemap.
## create
POST /v1/sitemap — start a sitemap crawl
## get
GET /v1/sitemap/{id} — get a sitemap job by ID