POST /v2/crawl
cURL
curl -X POST https://api.getcatalog.ai/v2/crawl \
  -H "Content-Type: application/json" \
  -H "x-api-key: $CATALOG_API_KEY" \
  -d '{
    "url": "skims.com"
  }'
{
  "execution_id": "crawl-skims-com-4ddfa1c9",
  "status": "pending",
  "meta": {
    "credits_used": 0
  }
}
When to use: Discover and index all product listings from a vendor website. The crawl process will:
  1. Discover collections from the vendor
  2. Extract product listings from each collection
This is ideal for initial discovery of a new vendor. Use POST /v2/extract to get full product details from discovered listings.
Async Processing: This endpoint starts processing asynchronously and returns an execution_id immediately. Use GET /v2/crawl/{execution_id} to check progress and retrieve results when processing completes.
Billing Requirement: Crawl requests require auto top-up to be enabled in your billing settings. This ensures you have sufficient credits to complete the crawl operation.
View All Executions: Use GET /v2/crawl to list all your crawl jobs, their statuses, and when they were created.
Cancel Execution: Use DELETE /v2/crawl/{execution_id} to stop a crawl that is currently processing.
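The polling flow described above can be sketched in Python. This is a minimal sketch, not an official client. Only the "pending" status is documented on this page, so treating every other status as final is an assumption. The function names, the 5-second interval, and the 600-second timeout are hypothetical as well. The HTTP call is passed in as a function so the loop can be tried without network access.

```python
import json
import time
import urllib.request

API_BASE = "https://api.getcatalog.ai"  # base URL taken from the cURL example


def crawl_request(method, path, api_key):
    """Build (but do not send) an authenticated request for a crawl endpoint."""
    return urllib.request.Request(
        API_BASE + path,
        method=method,
        headers={"x-api-key": api_key},
    )


def fetch_execution(execution_id, api_key):
    """GET /v2/crawl/{execution_id} and return the decoded JSON body."""
    req = crawl_request("GET", f"/v2/crawl/{execution_id}", api_key)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_crawl(execution_id, fetch, interval=5.0, timeout=600.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until the status is no longer "pending" or the timeout expires.

    ASSUMPTION: any status other than "pending" is treated as final. This
    page documents only "pending", so adjust the check to the real statuses.
    """
    deadline = clock() + timeout
    while True:
        body = fetch(execution_id)
        if body.get("status") != "pending":
            return body
        if clock() >= deadline:
            raise TimeoutError(f"{execution_id} still pending after {timeout}s")
        sleep(interval)
```

To wait on the example execution, one could call wait_for_crawl("crawl-skims-com-4ddfa1c9", lambda eid: fetch_execution(eid, api_key)). A crawl that needs to be stopped could be cancelled by passing crawl_request("DELETE", f"/v2/crawl/{execution_id}", api_key) to urllib.request.urlopen.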

Request

x-api-key
string
required
Your API key for authentication

Request Body

url
string
required
The vendor website URL or domain to crawl.
Note: The URL is sanitized and normalized to a canonical form on the backend; client-side sanitization is not required. If a crawl is already running for the same hostname, the existing execution ID will be returned.

Response

execution_id
string
Unique execution identifier for this crawl job. Use this ID with GET /v2/crawl/{execution_id} to check progress.
Format: crawl-{hostname}-{uuid}
Note: If a crawl is already running for the same hostname, this will be the existing execution ID.
status
string
Initial execution status. Always "pending" when the crawl is first started.
meta
object
Metadata about the request. The example response includes credits_used.
HTTP Status Code: This endpoint returns 202 Accepted to indicate the request has been accepted for processing. The response body contains the execution details you need to track the crawl progress.
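As a sketch of how a client might handle the 202 response, the Python function below sends the same request as the cURL example and returns the execution_id. The function name start_crawl and its opener parameter are hypothetical, and treating any status other than 202 as an error is an assumption rather than documented behavior.

```python
import json
import urllib.request

CRAWL_URL = "https://api.getcatalog.ai/v2/crawl"  # endpoint from the cURL example


def start_crawl(url, api_key, opener=urllib.request.urlopen):
    """POST /v2/crawl and return the execution_id from the 202 response.

    `opener` is a hypothetical injectable parameter so the function can be
    exercised without network access.
    """
    req = urllib.request.Request(
        CRAWL_URL,
        data=json.dumps({"url": url}).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json", "x-api-key": api_key},
    )
    with opener(req) as resp:
        # ASSUMPTION: anything other than 202 Accepted is treated as a failure.
        if resp.status != 202:
            raise RuntimeError(f"unexpected HTTP status {resp.status}")
        body = json.load(resp)
    return body["execution_id"]
```

Because a crawl already running for the same hostname returns the existing execution ID, calling start_crawl twice for the same vendor while that crawl is in progress should yield the same ID.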