Skip to main content
POST
/
v1
/
crawl
cURL
curl -X POST https://api.getcatalog.ai/v1/crawl \
  -H "Content-Type: application/json" \
  -H "x-api-key: $CATALOG_API_KEY" \
  -d '{
    "url": "skims.com"
  }'
{
  "success": true,
  "execution_id": "crawl-skims-com-9b68e2c1"
}

Documentation Index

Fetch the complete documentation index at: https://docs.getcatalog.ai/llms.txt

Use this file to discover all available pages before exploring further.

When to use: Discover and index all product listings from a vendor website. The crawl process will:
  1. Discover collections from the vendor
  2. Extract product listings from each collection
This is ideal for initial discovery of a new vendor. Use POST /v1/products to get full product details from discovered listings.
Async Processing: This endpoint starts processing asynchronously and returns an execution_id immediately. Use GET /v1/crawl/{execution_id} to check progress and retrieve results when processing completes.Billing Requirement: Crawl requests require auto top-up to be enabled in your billing settings. This ensures you have sufficient credits to complete the crawl operation.View All Executions: You can view all your crawl executions using GET /v1/crawl to see all crawl jobs, their statuses, and when they were created.Cancel Execution: You can cancel a running crawl execution using DELETE /v1/crawl/{execution_id} if you need to stop a crawl that is currently processing.

Request

x-api-key
string
required
Your API key for authentication

Request Body

url
string
required
The vendor website URL or domain to crawlNote: The URL will be normalized to a canonical form. If a crawl is already running for the same hostname, the existing execution ID will be returned.URL sanitization: URLs are sanitized and normalized on the backend; client-side sanitization is not required.

Response

success
boolean
Whether the crawl request was successfully started
execution_id
string
Unique execution identifier for this crawl job. Use this ID with GET /v1/crawl/{execution_id} to check progress.Format: crawl-{hostname}-{uuid}Note: If a crawl is already running for the same hostname, this will return the existing execution ID.
cURL
curl -X POST https://api.getcatalog.ai/v1/crawl \
  -H "Content-Type: application/json" \
  -H "x-api-key: $CATALOG_API_KEY" \
  -d '{
    "url": "skims.com"
  }'
{
  "success": true,
  "execution_id": "crawl-skims-com-9b68e2c1"
}