When to use: Discover and index all product listings from a vendor website. The crawl process will:
Discover collections from the vendor
Extract product listings from each collection
This is ideal for initial discovery of a new vendor. Use POST /v2/extract to get full product details from discovered listings.
Async Processing: This endpoint starts processing asynchronously and returns an execution_id immediately. Use GET /v2/crawl/{execution_id} to check progress and retrieve results when processing completes.
Billing Requirement: Crawl requests require auto top-up to be enabled in your billing settings. This ensures you have sufficient credits to complete the crawl operation.
View All Executions: Use GET /v2/crawl to list all your crawl jobs, their statuses, and when they were created.
Cancel Execution: Use DELETE /v2/crawl/{execution_id} to stop a crawl that is currently processing.
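The asynchronous flow described here (start a crawl, then check GET /v2/crawl/{execution_id} until it finishes) could be sketched as a small polling helper. This is only an illustration under stated assumptions: the `status` field name and the terminal status values are hypothetical, since this page does not list them, and `fetch_status` is a caller-supplied function that performs the actual GET request with whatever base URL and authentication your account uses.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical terminal status values; the real values are not documented here.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_crawl(
    execution_id: str,
    fetch_status: Callable[[str], Dict[str, Any]],
    interval_seconds: float = 5.0,
    max_attempts: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll a crawl execution until it reports a terminal status.

    fetch_status is expected to call GET /v2/crawl/{execution_id} and
    return the decoded JSON body as a dict.
    """
    for attempt in range(max_attempts):
        result = fetch_status(execution_id)
        # "status" is an assumed field name in the response body.
        if result.get("status") in TERMINAL_STATUSES:
            return result
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(
        f"crawl {execution_id} did not finish after {max_attempts} checks"
    )
```

Injecting `fetch_status` and `sleep` keeps the helper independent of any particular HTTP client and makes it easy to exercise without network access.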
The vendor website URL or domain to crawl.
Note: The URL will be normalized to a canonical form. If a crawl is already running for the same hostname, the existing execution ID will be returned.
URL sanitization: URLs are sanitized and normalized on the backend; client-side sanitization is not required.
Unique execution identifier for this crawl job. Use this ID with GET /v2/crawl/{execution_id} to check progress.
Format: crawl-{hostname}-{uuid}
Note: If a crawl is already running for the same hostname, this will return the existing execution ID.
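If a client needs to recover the hostname from an execution ID, the crawl-{hostname}-{uuid} format can be split from the right, because hostnames may themselves contain hyphens. The sketch below assumes the UUID portion uses the canonical 36-character hyphenated form, which this page does not state explicitly; `ExecutionId` and `parse_execution_id` are illustrative names, not part of the API.

```python
import uuid
from typing import NamedTuple

PREFIX = "crawl-"
UUID_LENGTH = 36  # assumed canonical form: 8-4-4-4-12 hex digits with hyphens


class ExecutionId(NamedTuple):
    hostname: str
    job_uuid: uuid.UUID


def parse_execution_id(execution_id: str) -> ExecutionId:
    """Split 'crawl-{hostname}-{uuid}' into its hostname and UUID parts."""
    if not execution_id.startswith(PREFIX):
        raise ValueError(f"missing {PREFIX!r} prefix: {execution_id!r}")
    remainder = execution_id[len(PREFIX):]
    # Need at least one hostname character, a separator, and the UUID.
    if len(remainder) < UUID_LENGTH + 2 or remainder[-UUID_LENGTH - 1] != "-":
        raise ValueError(f"unexpected execution ID layout: {execution_id!r}")
    hostname = remainder[: -UUID_LENGTH - 1]
    # uuid.UUID raises ValueError if the trailing part is not a valid UUID.
    job_uuid = uuid.UUID(remainder[-UUID_LENGTH:])
    return ExecutionId(hostname, job_uuid)
```

Treat the ID as opaque wherever possible; parsing it is only useful for tasks such as grouping local logs by hostname.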
Unique request identifier for support purposes (also available in the Request-ID response header)
HTTP Status Code: This endpoint returns 202 Accepted to indicate the request has been accepted for processing. The response body contains the execution details you need to track the crawl progress.
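A client might handle the 202 Accepted response along the following lines. This is a minimal sketch, not the official client: it takes an already-received status code, headers, and body, so it works with any HTTP library. The `execution_id` body field and the Request-ID header come from this page; the `request_id` body field name is an assumption, and `CrawlAccepted` and `parse_crawl_response` are illustrative names.

```python
import json
from typing import Mapping, NamedTuple, Optional


class CrawlAccepted(NamedTuple):
    execution_id: str
    request_id: Optional[str]


def parse_crawl_response(
    status_code: int, headers: Mapping[str, str], body: str
) -> CrawlAccepted:
    """Validate a crawl response and extract the tracking identifiers."""
    if status_code != 202:
        raise RuntimeError(f"expected 202 Accepted, got {status_code}")
    payload = json.loads(body)
    # HTTP header names are case-insensitive, so compare in lowercase.
    header_request_id = next(
        (value for name, value in headers.items() if name.lower() == "request-id"),
        None,
    )
    return CrawlAccepted(
        execution_id=payload["execution_id"],
        # "request_id" as a body field name is an assumption.
        request_id=header_request_id or payload.get("request_id"),
    )
```

Keeping the request identifier alongside the execution ID makes it easier to quote both when contacting support.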