GET https://api.getcatalog.ai/v2/crawl/{execution_id}
# Check execution status
curl -X GET "https://api.getcatalog.ai/v2/crawl/crawl-skims-com-a1b2c3d4" \
  -H "x-api-key: $CATALOG_API_KEY"
{
  "execution_id": "crawl-skims-com-a1b2c3d4",
  "status": "running",
  "error": null,
  "meta": {
    "vendor": "skims.com",
    "start_date": "2025-01-15T10:30:00.000Z",
    "stop_date": null,
    "duration_ms": 45000,
    "progress": {
      "collections_found": 12,
      "listings_found": 456
    },
    "steps": [
      {
        "name": "Determining crawl method",
        "status": "completed"
      },
      {
        "name": "Discovering collections",
        "status": "running"
      },
      {
        "name": "Analyzing collections",
        "status": "pending"
      },
      {
        "name": "Extracting listings",
        "status": "pending"
      }
    ]
  }
}
When to use: After starting a crawl job with POST /v2/crawl, use this endpoint to:
  • Check if the crawl is complete
  • Monitor real-time progress with step-by-step status
  • Get detailed metrics including collections and listings found
  • Track execution timing and duration
Cancel Running Executions: To stop a crawl that is currently running, call DELETE /v2/crawl/{execution_id}.
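As a rough illustration of the cancellation call, the sketch below only builds the DELETE request and does not send it. The helper name build_cancel_request is hypothetical, and it assumes the DELETE endpoint accepts the same x-api-key header as the GET endpoint; the response body of the DELETE call is not described on this page, so it is not handled here.

```python
import urllib.request

BASE_URL = "https://api.getcatalog.ai/v2/crawl"


def build_cancel_request(execution_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a DELETE request for a crawl execution.

    Assumption: DELETE authenticates with the same x-api-key header as GET.
    """
    return urllib.request.Request(
        f"{BASE_URL}/{execution_id}",
        method="DELETE",
        headers={"x-api-key": api_key},
    )


# Sending it would look like:
#   urllib.request.urlopen(build_cancel_request(execution_id, api_key), timeout=30)
```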

Request

x-api-key
string
required
Your API key for authentication
execution_id
string
required
The execution ID returned from POST /v2/crawl.
Format: crawl-{hostname}-{uuid}
Note: If the execution ID does not exist or does not belong to your organization, the endpoint returns a 404 Not Found error.

Response

execution_id
string
The execution identifier
status
string
Current execution status. Possible values:
  • "pending" - Execution has been created but not yet started
  • "running" - Execution is currently processing
  • "completed" - Execution finished successfully
  • "failed" - Execution failed or was aborted
error
string | null
Error message if the execution failed, otherwise null
meta
object
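Detailed metadata about the execution. In the example response it contains vendor, start_date, stop_date, duration_ms, progress (collections_found and listings_found), and steps (a list of entries, each with a name and a status).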
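To make the response fields concrete, here is a minimal sketch that reduces a status payload to the values a client usually cares about. The function name summarize and the shape of its return value are illustrative assumptions, not part of the API; the sample payload mirrors the example response on this page.

```python
# Sample payload mirroring the example response on this page.
SAMPLE = {
    "execution_id": "crawl-skims-com-a1b2c3d4",
    "status": "running",
    "error": None,
    "meta": {
        "vendor": "skims.com",
        "start_date": "2025-01-15T10:30:00.000Z",
        "stop_date": None,
        "duration_ms": 45000,
        "progress": {"collections_found": 12, "listings_found": 456},
        "steps": [
            {"name": "Determining crawl method", "status": "completed"},
            {"name": "Discovering collections", "status": "running"},
            {"name": "Analyzing collections", "status": "pending"},
            {"name": "Extracting listings", "status": "pending"},
        ],
    },
}

# Statuses after which the execution will not change any more.
TERMINAL_STATUSES = {"completed", "failed"}


def summarize(payload: dict) -> dict:
    """Hypothetical helper: pull out status, current step, and progress counts."""
    meta = payload.get("meta") or {}
    progress = meta.get("progress") or {}
    steps = meta.get("steps") or []
    # The step currently executing is the first one marked "running", if any.
    current_step = next(
        (step["name"] for step in steps if step.get("status") == "running"), None
    )
    return {
        "status": payload["status"],
        "is_terminal": payload["status"] in TERMINAL_STATUSES,
        "current_step": current_step,
        "collections_found": progress.get("collections_found", 0),
        "listings_found": progress.get("listings_found", 0),
        "error": payload.get("error"),
    }


summary = summarize(SAMPLE)
print(summary["status"], summary["current_step"], summary["listings_found"])
```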

Polling Strategy

For best results when waiting for completion:
  1. Initial Poll: Check status immediately after receiving execution_id
  2. Polling Interval: Wait 10-30 seconds between polls for running executions (crawls can take longer than product processing)
  3. Exponential Backoff: Consider increasing wait time for long-running crawls
  4. Timeout: Set a maximum wait time based on the size of the vendor website
  5. Step Monitoring: Use the meta.steps array in the response to see which phase of the crawl is currently executing
  6. Progress Tracking: Monitor meta.progress to see real-time counts of collections and listings discovered
  7. Cancellation: If a crawl is taking too long or is no longer needed, stop it with DELETE /v2/crawl/{execution_id}
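The polling steps above can be sketched roughly as follows. This is one possible client-side loop, not an official implementation: the names fetch_status and wait_for_crawl are hypothetical, the 10 second starting delay and 30 second cap come from the interval suggested above, and the 1.5x backoff factor and one hour default timeout are arbitrary assumptions to tune for the vendor site.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "https://api.getcatalog.ai/v2/crawl"
TERMINAL_STATUSES = {"completed", "failed"}


def fetch_status(execution_id: str, api_key: str) -> dict:
    """GET the execution status. urlopen raises HTTPError on a 404 response."""
    request = urllib.request.Request(
        f"{BASE_URL}/{execution_id}", headers={"x-api-key": api_key}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_crawl(
    fetch: Callable[[], dict],
    *,
    initial_delay: float = 10.0,  # lower bound of the suggested interval
    max_delay: float = 30.0,      # upper bound of the suggested interval
    backoff: float = 1.5,         # assumed growth factor between polls
    timeout: float = 3600.0,      # assumed maximum total wait in seconds
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll until the execution is completed or failed, or raise TimeoutError."""
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        payload = fetch()  # the first poll happens immediately
        if payload["status"] in TERMINAL_STATUSES:
            return payload
        if clock() + delay > deadline:
            raise TimeoutError("crawl did not finish within the allowed time")
        sleep(delay)
        delay = min(delay * backoff, max_delay)  # back off up to the cap


# Usage (performs real network calls, so it is left commented out):
#   import os
#   result = wait_for_crawl(
#       lambda: fetch_status("crawl-skims-com-a1b2c3d4", os.environ["CATALOG_API_KEY"])
#   )
#   print(result["status"], result["error"])
```

Passing fetch, sleep, and clock as parameters keeps the loop testable without network access or real waiting. On TimeoutError, a caller could follow up with the DELETE endpoint to stop the crawl.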