Migrating from v1 to v2 - Catalog Documentation

This guide covers all changes between Catalog API v1 and v2 to help you successfully migrate your integration. Each endpoint group has been reviewed and updated with improved consistency, better error handling, and enhanced features.

Overview of Changes

The main improvements in v2 include:

Consistent Response Format: All endpoints now follow a unified response structure with data, pagination, and meta fields
Enhanced Extract Endpoints: New vendor-based extraction mode and improved URL extraction
Better Execution Management: More detailed execution status tracking with progress, timing, and step information
Improved Error Handling: Enhanced error responses with additional context fields
HTTP Status Codes: Better adherence to REST standards (e.g., 202 Accepted for async operations)

Discover Endpoints

The Discover endpoints have undergone moderate changes, primarily in response format consistency and enhanced status tracking.

Major Changes

1. Crawl Endpoint

v1:

POST /v1/crawl

v2:

POST /v2/crawl

Response Format:

v1:

{
  "success": true,
  "execution_id": "crawl-skims-com-9b68e2c1"
}

v2:

{
  "execution_id": "crawl-skims-com-4ddfa1c9",
  "status": "pending",
  "meta": {
    "credits_used": 0
  }
}

Key Changes:

Removed success field
Added status field (always "pending" in POST response)
Added meta object with credits_used
HTTP status code changed to 202 Accepted (was 200 OK)
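
As a rough sketch of what this means for client code, the helper below accepts a crawl POST result based on the HTTP status code rather than a success flag. The name parse_crawl_post and the way the status code and decoded body are passed in are assumptions for illustration; they are not part of the Catalog API.

```python
def parse_crawl_post(http_status: int, body: dict) -> str:
    """Return the execution_id from a v2 crawl POST response.

    Hypothetical helper: pass in the HTTP status code and the decoded JSON
    body obtained from whatever HTTP client you use.
    """
    # v2 signals acceptance with 202 Accepted; there is no "success" field.
    if http_status != 202:
        raise RuntimeError(f"crawl not accepted (HTTP {http_status})")
    execution_id = body.get("execution_id")
    if not execution_id:
        raise RuntimeError("response is missing execution_id")
    return execution_id
```

Code that previously branched on body["success"] can call a helper like this and treat any raised exception as the failure path.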

2. Get Crawl Status Endpoint

v1 Response:

{
  "status": "completed",
  "total_listings_found": 1234
}

v2 Response:

{
  "execution_id": "crawl-example-com-a1b2c3d4",
  "status": "completed",
  "error": null,
  "meta": {
    "vendor": "example.com",
    "start_date": "2025-01-15T10:30:00.000Z",
    "stop_date": "2025-01-15T10:45:30.000Z",
    "duration_ms": 930000,
    "progress": {
      "collections_found": 12,
      "listings_found": 456
    },
    "steps": [
      {
        "name": "Determining crawl method",
        "status": "completed"
      },
      {
        "name": "Discovering collections",
        "status": "completed"
      },
      {
        "name": "Analyzing collections",
        "status": "completed"
      },
      {
        "name": "Extracting listings",
        "status": "completed"
      }
    ],
    "result": {
      "collections_total": 25,
      "listings_total": 1234
    }
  }
}

Key Changes:

total_listings_found moved to meta.result.listings_total
Added execution_id in response
Added error field (always present, null if no error)
Added comprehensive meta object with:
- vendor: The vendor hostname
- start_date, stop_date, duration_ms: Timing information
- progress: Real-time progress (only when running)
- steps: Array of step statuses showing crawl phase progress
- result: Final statistics (collections_total, listings_total)

New Features:

Step-by-step progress tracking via meta.steps
Real-time progress via meta.progress when status is "running"
Timing information for performance monitoring
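
The sketch below shows one way to read the listing total from either response shape and to find the crawl phase currently in progress. The function names are hypothetical, and treating any step whose status is not "completed" as the current one is an assumption, since only "completed" step statuses appear in the example above.

```python
from typing import Optional


def listings_total(resp: dict) -> Optional[int]:
    """Read the total listing count from a v1 or v2 crawl status response."""
    # v1 exposed the count at the top level.
    if "total_listings_found" in resp:
        return resp["total_listings_found"]
    # v2 nests it under meta.result; it may be absent before completion.
    meta = resp.get("meta") or {}
    return (meta.get("result") or {}).get("listings_total")


def current_step(resp: dict) -> Optional[str]:
    """Return the name of the first step that is not yet completed, if any."""
    meta = resp.get("meta") or {}
    for step in meta.get("steps") or []:
        if step.get("status") != "completed":
            return step.get("name")
    return None
```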

3. List Crawl Executions Endpoint

v1:

GET /v1/crawl?status=completed&limit=20

v2:

GET /v2/crawl?status=completed&page_size=20

Query Parameter Changes:

limit parameter renamed to page_size in v2

Response Format:

v1:

{
  "executions": [
    {
      "execution_id": "crawl-example-com-...",
      "status": "completed",
      "created_at": "2025-01-15T10:30:00.000Z"
    }
  ]
}

v2:

{
  "data": [
    {
      "execution_id": "crawl-example-com-...",
      "status": "completed",
      "created_at": "2025-01-15T10:30:00.000Z"
    }
  ],
  "pagination": {
    "page": 1,
    "page_size": 20,
    "total_items": 10,
    "total_pages": 1,
    "has_next": false,
    "has_prev": false
  },
  "meta": {}
}

Key Changes:

Response field changed from executions to data
Added full pagination object
Added meta object

4. Get Products (Listings) Endpoint

v1:

POST /v1/listings

v2:

POST /v2/listings

Response Format:

v1:

{
  "listings": [{ "_truncated": "Additional items omitted for display" }],
  "meta": {
    "total_items": 1290,
    "total_pages": 645,
    "current_page": 1,
    "page_size": 2,
    "has_next": true,
    "has_prev": false
  }
}

v2:

{
  "data": [{ "_truncated": "Additional items omitted for display" }],
  "pagination": {
    "page": 1,
    "page_size": 2,
    "total_items": 1616,
    "total_pages": 808,
    "has_next": true,
    "has_prev": false
  },
  "meta": {}
}

Key Changes:

Response field changed from listings to data
meta pagination fields moved to pagination object
current_page renamed to page in pagination
meta object is now separate and empty for this endpoint

5. Get Collections Endpoint

v1:

POST /v1/collections

v2:

POST /v2/collections

Response Format:

v1:

{
  "collections": [{ "_truncated": "Additional items omitted for display" }],
  "meta": {
    "total_items": 50,
    "total_pages": 5,
    "current_page": 1,
    "page_size": 10,
    "has_next": true,
    "has_prev": false
  }
}

v2:

{
  "data": [{ "_truncated": "Additional items omitted for display" }],
  "pagination": {
    "page": 1,
    "page_size": 20,
    "total_items": 50,
    "total_pages": 3,
    "has_next": true,
    "has_prev": false
  },
  "meta": {}
}

Key Changes:

Response field changed from collections to data
meta pagination fields moved to pagination object
current_page renamed to page in pagination
Default page_size changed from 10 to 20 in v2
meta object is now separate and empty for this endpoint
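
If existing code is written against the v1 shape, a small adapter can help during the transition. The sketch below reshapes a v1 listings or collections response into the v2 layout; to_v2_page and its items_key argument are hypothetical names introduced here for illustration.

```python
def to_v2_page(v1_resp: dict, items_key: str) -> dict:
    """Reshape a v1 paginated response ("listings" or "collections") to v2."""
    meta = v1_resp.get("meta") or {}
    return {
        # v1 used an endpoint-specific key; v2 always uses "data".
        "data": v1_resp.get(items_key, []),
        "pagination": {
            "page": meta.get("current_page"),  # renamed from current_page
            "page_size": meta.get("page_size"),
            "total_items": meta.get("total_items"),
            "total_pages": meta.get("total_pages"),
            "has_next": meta.get("has_next"),
            "has_prev": meta.get("has_prev"),
        },
        # v2 keeps meta as a separate object, empty for these endpoints.
        "meta": {},
    }
```

For example, to_v2_page(v1_body, "collections") applied to the v1 sample above yields a pagination object with page set to 1 and page_size set to 10.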

6. Get Vendors Endpoint

No Changes: The Get Vendors endpoint response format remains identical between v1 and v2. Only the endpoint path changes from /v1/vendors to /v2/vendors.

Migration Steps for Discover Endpoints

Update response parsing: Read items from data instead of listings or collections, and read pagination fields from the pagination object instead of meta (current_page is now page)
Update query parameters: Change limit to page_size in list endpoints
Update crawl status handling: Access meta.result.listings_total instead of total_listings_found
Utilize new features: Consider using meta.steps and meta.progress for better crawl monitoring
Update POST response handling: Remove success field checks, use HTTP status codes instead
Note default change: Get Collections default page_size changed from 10 to 20

Extract Endpoints

The Extract endpoints have undergone significant changes in v2. Both /v1/products (URLs) and /v1/extract (vendor-based) endpoints have been unified into a single /v2/extract endpoint that offers two distinct modes of operation.

Major Changes

1. Endpoint Path Changed

v1:

POST /v1/products (URLs)
GET /v1/products (List executions)
GET /v1/products/{execution_id} (Get status)
DELETE /v1/products/{execution_id} (Cancel)

POST /v1/extract (Vendor-based)
GET /v1/extract (List executions)
GET /v1/extract/{execution_id} (Get status)
DELETE /v1/extract/{execution_id} (Cancel)

v2:

POST /v2/extract (with urls or vendor parameter)
GET /v2/extract (List executions)
GET /v2/extract/{execution_id} (Get status)
DELETE /v2/extract/{execution_id} (Cancel)

Note: In v2, both URLs and vendor-based extraction use the same endpoint (/v2/extract). The input source is determined by the request body: use urls for product URLs, or vendor for extracting from a vendor.

2. Unified Extraction Endpoint

v2 unifies both v1 extraction endpoints (/v1/products and /v1/extract) into a single /v2/extract endpoint with two input source options:

Using urls parameter (Migrated from v1 /v1/products):

Use when you have specific product URLs
Post to /v2/extract with urls array in the request body
Works the same way as v1 /v1/products but with updated response format
Execution ID format: extract-urls-{uuid}

Using vendor parameter (Migrated from v1 /v1/extract):

Use when you want to extract all products from a vendor
Post to /v2/extract with vendor field in the request body
Requires prior crawl or can wait for a crawl using crawl_id parameter
Useful for bulk extraction from discovered listings
Execution ID format: extract-{vendor}-{uuid}

3. Request Body Changes

v1 Request (POST /v1/products):

{
  "urls": ["https://example.com/product"],
  "enable_enrichment": true,
  "country_code": "us",
  "enable_reviews": false,
  "enable_image_tags": false
}

v2 Request - Using urls (POST /v2/extract):

{
  "urls": ["https://example.com/product"],
  "enable_enrichment": false,  // Default changed from true to false
  "country_code": "us",
  "enable_reviews": false,
  "enable_image_tags": false,
  // New optional parameters:
  "enable_brand_pdp": false,
  "enable_similar_products": false,
  "enable_reddit_insights": false
}

v1 Request (POST /v1/extract):

{
  "domain": "nike.com",
  "max_products": 100,
  "enable_enrichment": true,
  "country_code": "us",
  "enable_reviews": true,
  "enable_image_tags": true,
  "crawl_id": "crawl-nike-com-7f42-402f"
}

v2 Request - Using vendor (POST /v2/extract):

{
  "vendor": "nike.com",  // Changed from "domain" in v1
  "max_products": 100,  // Optional: omit for no limit
  "crawl_id": "crawl-nike-com-7f42-402f",  // Optional: wait for crawl
  "enable_enrichment": false,  // Default changed from true to false
  "country_code": "us",
  "enable_reviews": false,  // Default changed from true to false
  "enable_image_tags": false,  // Default changed from true to false
  // New optional parameters:
  "enable_brand_pdp": false,
  "enable_similar_products": false,
  "enable_reddit_insights": false
}

Key Changes:

Parameter name changed: domain → vendor in v2
enable_enrichment default changed from true to false in v2
enable_reviews default changed from true to false in v2
enable_image_tags default changed from true to false in v2
New optional parameters: enable_brand_pdp, enable_similar_products, enable_reddit_insights
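
Because the defaults flipped, a request that omitted these flags in v1 behaves differently in v2. The sketch below converts a v1 vendor-based request body to a v2 body while pinning the old defaults explicitly. The function name is hypothetical, and the assumption that all three flags defaulted to true in v1 is taken from the vendor-based example above.

```python
def migrate_vendor_request(v1_body: dict) -> dict:
    """Convert a POST /v1/extract body into a POST /v2/extract body."""
    v2_body = dict(v1_body)  # shallow copy so the caller's dict is untouched
    # "domain" was renamed to "vendor" in v2.
    if "domain" in v2_body:
        v2_body["vendor"] = v2_body.pop("domain")
    # Preserve v1 behavior: these defaulted to true in v1, false in v2.
    for flag in ("enable_enrichment", "enable_reviews", "enable_image_tags"):
        v2_body.setdefault(flag, True)
    return v2_body
```

Flags that the v1 request already set explicitly are left as they were.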

4. Response Format Changes

v1 POST Response:

{
  "success": true,
  "execution_id": "products-batch-51d87084-7f42-402f-a5af-6ff512c82cd0"
}

v2 POST Response:

{
  "execution_id": "extract-urls-965b1912-6af0-4ed8-b7e3-184b85e788b7",
  "status": "pending",
  "meta": {
    "credits_used": 10,
    "url_count": 2
  }
}

Key Changes:

Removed success field (HTTP status code indicates success)
Added status field (always "pending" in POST response)
Added meta object with credits_used and url_count (or listings_count, products_to_process for vendor-based)
HTTP status code changed to 202 Accepted (was 200 OK)
Execution ID format for URL extractions changed from products-batch-{uuid} to extract-urls-{uuid}; vendor-based extractions use extract-{vendor}-{uuid}

5. Execution ID Format Changes

v1:

Format: products-batch-{uuid}

v2:

When using urls: extract-urls-{uuid}
When using vendor: extract-{vendor}-{uuid} (dots in vendor replaced with dashes)
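
If your code stores or routes on execution IDs, the prefixes above can be checked with a small helper. This is a sketch only; the function names and the returned labels are made up for illustration.

```python
def vendor_execution_prefix(vendor: str) -> str:
    """Build the expected v2 ID prefix for a vendor (dots become dashes)."""
    return "extract-" + vendor.replace(".", "-") + "-"


def execution_kind(execution_id: str) -> str:
    """Classify an execution ID by its documented prefix."""
    if execution_id.startswith("products-batch-"):
        return "v1-urls"
    # Check the more specific prefix first; a vendor whose name begins with
    # "urls" would be indistinguishable by prefix alone.
    if execution_id.startswith("extract-urls-"):
        return "urls"
    if execution_id.startswith("extract-"):
        return "vendor"
    return "unknown"
```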

6. List Executions Endpoint Changes

v1:

GET /v1/products?status=completed&page_size=20

v2:

GET /v2/extract?status=processing&page_size=20

Response Format:

Response format is consistent between v1 and v2
Status values: v1 uses "processing", v2 distinguishes "pending" and "running" (both included when filtering by "processing")

7. Get Execution Status Changes

v1:

GET /v1/products/{execution_id}?page=1&limit=50

Note: v1 accepts both limit and page_size query parameters for backward compatibility.

v2:

GET /v2/extract/{execution_id}?page=1&page_size=50

Query Parameter Changes:

limit parameter renamed to page_size in v2 (v1 accepts both)
Default page_size is 50 (same as v1’s default limit)

Response Format:

v1:

{
  "status": "completed",
  "progress": {
    "products_processed": 45,
    "urls_completed": 45,
    "total_urls": 100,
    "percent_complete": 45
  },
  "error": null,
  "results": {
    "products": [{ "_truncated": "Additional items omitted for display" }],
    "meta": {
      "total_requested": 100,
      "total_successful": 98,
      "total_failed": 2
    }
  },
  "pagination": {
    "page": 1,
    "limit": 50,
    "total_items": 100,
    "total_pages": 2,
    "has_next": true
  }
}

v2:

{
  "execution_id": "extract-urls-965b1912-6af0-4ed8-b7e3-184b85e788b7",
  "status": "completed",
  "error": null,
  "data": [{ "_truncated": "Additional items omitted for display" }],
  "pagination": {
    "page": 1,
    "page_size": 50,
    "total_items": 100,
    "total_pages": 2,
    "has_next": true,
    "has_prev": false
  },
  "meta": {
    "progress": {
      "products_processed": 98,
      "products_total": 100,
      "percent_complete": 100
    },
    "start_date": "2025-01-15T10:30:00.000Z",
    "stop_date": "2025-01-15T10:35:22.000Z",
    "duration_ms": 322000,
    "result": {
      "products_requested": 100,
      "products_successful": 98,
      "products_failed": 2
    },
    "waiting_for": "crawl-nike-com-7f42-402f"  // Only present when waiting for crawl
  }
}

Key Changes:

results.products moved to top-level data array
results.meta moved to meta.result
progress moved to meta.progress
Progress field names changed: urls_completed/total_urls → products_processed/products_total
Added execution_id in response
Added timing fields: meta.start_date, meta.stop_date, meta.duration_ms
Added meta.waiting_for for vendor-based extractions waiting for crawl
pagination.limit renamed to pagination.page_size
Added pagination.has_prev field
Status values: v2 distinguishes "pending" (waiting) and "running" (actively processing)
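
The field moves listed above can be summarized as a mapping from the v1 status response to the v2 layout. The sketch below is one reading of that mapping, not an official converter: the function name is hypothetical, the pairing of total_requested, total_successful, and total_failed with the products_* names is inferred from the two examples, and has_prev is derived from the page number because v1 does not return it.

```python
def to_v2_status(v1: dict) -> dict:
    """Reshape a v1 execution status response into the v2 layout."""
    results = v1.get("results") or {}
    stats = results.get("meta") or {}
    progress = v1.get("progress") or {}
    page = v1.get("pagination") or {}
    return {
        "status": v1.get("status"),
        "error": v1.get("error"),
        "data": results.get("products", []),  # was results.products
        "pagination": {
            "page": page.get("page"),
            "page_size": page.get("limit"),  # was pagination.limit
            "total_items": page.get("total_items"),
            "total_pages": page.get("total_pages"),
            "has_next": page.get("has_next"),
            "has_prev": (page.get("page") or 1) > 1,  # derived, not in v1
        },
        "meta": {
            "progress": {  # was top-level progress
                "products_processed": progress.get("products_processed"),
                "products_total": progress.get("total_urls"),
                "percent_complete": progress.get("percent_complete"),
            },
            "result": {  # was results.meta
                "products_requested": stats.get("total_requested"),
                "products_successful": stats.get("total_successful"),
                "products_failed": stats.get("total_failed"),
            },
        },
    }
```

The v2-only fields (execution_id, timing fields, waiting_for) have no v1 source and are left out of the result.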

8. Cancel Execution Changes

v1:

DELETE /v1/products/{execution_id}

v2:

DELETE /v2/extract/{execution_id}

Response Format: No changes, but error handling is improved.

v2 Additional Error Cases:

Cannot cancel executions waiting for crawl (status "pending" with waiting_for)
Better error messages for different failure scenarios

Migration Steps for Extract Endpoints

Update endpoint paths:
- Change /v1/products to /v2/extract (for URLs)
- Change /v1/extract to /v2/extract (for vendor-based extraction)
Update request parameters:
- For vendor-based: Change domain parameter to vendor in request body
- Set enable_enrichment, enable_reviews, and enable_image_tags explicitly if you were relying on v1 defaults
Update execution ID handling:
- URLs: Change format from products-batch-{uuid} to extract-urls-{uuid}
- Vendor-based: Execution ID format remains extract-{vendor}-{uuid} (unchanged)
Update response parsing: Change results.products to data, results.meta to meta.result
Update query parameters: Change limit to page_size in status endpoint
Unified endpoint: Both extraction modes now use the same /v2/extract endpoint - choose mode via request body (urls for URLs, vendor for vendor-based)

Search Endpoints

The Search endpoints have undergone significant changes in v2. The most important change is that Agentic Search is now asynchronous in v2 (it was synchronous in v1).

Major Changes

1. Agentic Search Endpoint - Now Asynchronous

v1 (Synchronous):

POST /v1/agentic-search

Returns products immediately in the response
HTTP status: 200 OK
Blocks until search completes

v2 (Asynchronous):

POST /v2/agentic-search

Returns an execution_id immediately
HTTP status: 202 Accepted
Search processes asynchronously in the background
Use the status endpoint to check progress and retrieve results

v1 POST Response (Synchronous):

{
  "products": [{ "_truncated": "Additional items omitted for display" }],
  "meta": {
    "totalItems": 100,
    "urls": ["_truncated"]
  }
}

v2 POST Response (Asynchronous):

{
  "execution_id": "agentic-a1b2c3d4-1735123456789",
  "status": "pending",
  "meta": {
    "credits_used": 25
  }
}

Key Changes:

v2 is asynchronous: POST endpoint returns execution_id instead of products
HTTP status code: Changed from 200 OK to 202 Accepted
Response structure: No products in POST response; use status endpoint to retrieve results
Execution management: New endpoints for checking status, listing executions, and canceling

2. New v2 Endpoints for Agentic Search

v2 introduces new endpoints to manage asynchronous search executions:

Get Execution Status:

GET /v2/agentic-search/{execution_id}

Check execution progress and retrieve results when complete
Returns products in data array when status is "completed"
Includes progress tracking via meta.progress
Supports pagination for results

List Executions:

GET /v2/agentic-search?status=completed&page_size=20

View all your search executions
Filter by status (processing, completed, failed)
See execution IDs and creation timestamps

Cancel Execution:

DELETE /v2/agentic-search/{execution_id}

Cancel a running search execution
Only works for executions with status "pending" or "running"

3. Get Execution Status Response Format

When checking status in v2, the response format when status is "completed":

{
  "execution_id": "agentic-a1b2c3d4-1735123456789",
  "status": "completed",
  "error": null,
  "data": [{ "_truncated": "Additional items omitted for display" }],
  "pagination": {
    "page": 1,
    "page_size": 50,
    "total_items": 8,
    "total_pages": 1,
    "has_next": false,
    "has_prev": false
  },
  "meta": {
    "query": "A Barbour jacket suitable for the office",
    "progress": {
      "products_processed": 8,
      "products_total": 8,
      "percent_complete": 100
    },
    "start_date": "2025-01-15T10:30:00.000Z",
    "stop_date": "2025-01-15T10:32:30.000Z",
    "duration_ms": 150000,
    "result": {
      "products_requested": 8,
      "products_successful": 8,
      "products_failed": 0
    }
  }
}

Key Differences from v1:

Products are in data array (not products)
Added pagination object
Added execution_id in response
Added meta.progress for real-time progress tracking
Added timing information (start_date, stop_date, duration_ms)
Added meta.result with execution statistics

4. Agentic Search Mini Endpoint

The Agentic Search Mini endpoint receives similar response format changes, but it remains synchronous in v2:

Response field changed from products to data
Added pagination object
meta structure updated

Migration Steps for Search Endpoints

Implement async pattern:
- POST endpoint now returns execution_id instead of products
- Store the execution_id from the POST response
- Poll the GET status endpoint to check for completion
Update status checking:
- Use GET /v2/agentic-search/{execution_id} to check status
- Handle status values: "pending", "running", "completed", "failed"
- Implement polling logic (recommended: 5-10 second intervals)
Update response parsing:
- Products are now in data array (not products) when retrieved from status endpoint
- Access pagination from pagination object
- Original meta fields like totalItems and urls are in meta object (when available)
Handle new features:
- Monitor meta.progress for real-time progress updates
- Use timing information (duration_ms) for performance monitoring
- Consider using the list executions endpoint for execution management
Update error handling:
- Check status field for "failed" state
- Read error field for error messages
- Handle HTTP 202 Accepted status code for POST requests
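
The polling steps above can be sketched as a small loop. This is an illustrative outline under stated assumptions: wait_for_search is a hypothetical name, fetch_status stands in for whatever function in your code performs GET /v2/agentic-search/{execution_id} and returns the decoded JSON body, and the 600-second timeout is an arbitrary choice rather than a documented limit.

```python
import time
from typing import Callable


def wait_for_search(
    fetch_status: Callable[[str], dict],
    execution_id: str,
    interval_s: float = 5.0,   # within the recommended 5-10 second range
    timeout_s: float = 600.0,  # arbitrary upper bound chosen for this sketch
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll the status endpoint until the execution completes or fails."""
    deadline = time.monotonic() + timeout_s
    while True:
        resp = fetch_status(execution_id)
        status = resp.get("status")
        if status == "completed":
            return resp  # products are in resp["data"]
        if status == "failed":
            raise RuntimeError(f"search failed: {resp.get('error')}")
        # "pending" and "running" both mean: keep waiting.
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{execution_id} did not finish in {timeout_s}s")
        sleep(interval_s)
```

Passing the fetch and sleep functions as arguments keeps the loop independent of any particular HTTP client and makes it easy to exercise in tests with canned responses.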

Affiliate Endpoints

The Affiliate endpoint has undergone response format standardization.

Major Changes

Generate Affiliate Links Endpoint

v1:

POST /v1/affiliate

v2:

POST /v2/affiliate

Response Format:

v1:

{
  "results": [{ "_truncated": "Additional items omitted for display" }],
  "total_processed": 3,
  "successful": 2,
  "failed": 1
}

v2:

{
  "data": [{ "_truncated": "Additional items omitted for display" }],
  "pagination": {
    "page": 1,
    "page_size": 3,
    "total_items": 3,
    "total_pages": 1,
    "has_next": false,
    "has_prev": false
  },
  "meta": {
    "total_processed": 3,
    "successful": 2,
    "failed": 1
  }
}

Key Changes:

Response field changed from results to data
Top-level summary fields (total_processed, successful, failed) moved to meta object
Added pagination object for consistency

Migration Steps for Affiliate Endpoints

Update response parsing: Change results to data
Update summary statistics: Access total_processed, successful, and failed from meta object instead of top-level
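
A minimal sketch of the parsing change, written so the same code can read either version during a staged rollout. The function name is hypothetical, and detecting the version by the presence of the data key is an assumption of this sketch.

```python
def affiliate_summary(resp: dict) -> dict:
    """Extract links and summary counts from a v1 or v2 affiliate response."""
    is_v2 = "data" in resp
    links = resp["data"] if is_v2 else resp.get("results", [])
    # v2 moved the summary counters from the top level into meta.
    stats = (resp.get("meta") or {}) if is_v2 else resp
    return {
        "links": links,
        "total_processed": stats.get("total_processed"),
        "successful": stats.get("successful"),
        "failed": stats.get("failed"),
    }
```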

Usage Endpoints

The Usage endpoint has no changes.

General Changes

Response Format Standardization

All v2 endpoints follow a consistent response structure:

{
  "data": [{ "_truncated": "Additional items omitted for display" }],
  "pagination": {          // Pagination metadata (when applicable)
    "page": 1,
    "page_size": 10,
    "total_items": 100,
    "total_pages": 10,
    "has_next": true,
    "has_prev": false
  },
  "meta": {               // Additional metadata
    // Endpoint-specific metadata
  }
}

HTTP Status Codes

v2 uses 202 Accepted for async operations (POST requests that start async processing)
v1 used 200 OK for all successful requests

Error Handling

v1 Error Response:

{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable error description",
    "request_id": "req_truncated"
  }
}

v2 Error Response:

{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable error description",
    "request_id": "req_truncated"
  },
  "invalid_urls": [  // Additional context for validation errors
    "Index 0: https://invalid-url - invalid format"
  ]
}

Key Changes:

v2 may include additional error context fields for specific error types (e.g., invalid_urls for validation errors)
All error responses still include Request-ID header matching the request_id in the response body
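
As an illustration, an error formatter can surface the extra context when it is present and fall back to the common fields otherwise. The function name and output format below are assumptions made for this sketch.

```python
def describe_error(body: dict) -> str:
    """Build a one-line description from a v1 or v2 error response body."""
    err = body.get("error") or {}
    text = (
        f"{err.get('code')}: {err.get('message')} "
        f"(request_id={err.get('request_id')})"
    )
    # v2 may add context fields such as invalid_urls for validation errors.
    invalid = body.get("invalid_urls") or []
    if invalid:
        text += "; invalid_urls: " + " | ".join(invalid)
    return text
```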

Pagination Consistency

v1:

Mixed pagination structures across endpoints
Some used page, page_size, total_items, total_pages
Some used current_page, limit, etc.

v2:

Consistent pagination structure across all endpoints:
- page: Current page number
- page_size: Number of items per page
- total_items: Total number of items
- total_pages: Total number of pages
- has_next: Boolean indicating if next page exists
- has_prev: Boolean indicating if previous page exists
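
With a single pagination shape, one loop can walk any paginated v2 endpoint. The generator below is a sketch: iter_items is a hypothetical name, and fetch_page stands in for a function in your code that requests the given page number and returns the decoded v2 response.

```python
from typing import Any, Callable, Iterator


def iter_items(fetch_page: Callable[[int], dict]) -> Iterator[Any]:
    """Yield every item across pages, following pagination.has_next."""
    page = 1
    while True:
        resp = fetch_page(page)
        yield from resp.get("data", [])
        # Stop when the server reports no further pages.
        if not (resp.get("pagination") or {}).get("has_next"):
            break
        page += 1
```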

Query Parameter Naming

limit → page_size (in list and status endpoints)
All pagination-related parameters now consistently use page and page_size
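
A small sketch of the rename for code that builds query strings from a dictionary. The function name is hypothetical; if both keys are present, this version keeps the existing page_size value.

```python
def migrate_query(params: dict) -> dict:
    """Rename the v1 "limit" query parameter to the v2 "page_size"."""
    out = dict(params)  # copy so the caller's dict is not modified
    if "limit" in out:
        limit = out.pop("limit")
        # Keep an explicit page_size if the caller already set one.
        out.setdefault("page_size", limit)
    return out
```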

Execution Status Values

v1:

"pending", "running", "completed", "failed", "processing" (ambiguous)

v2:

"pending": Execution created but not started (may be waiting for crawl)
"running": Execution actively processing
"completed": Execution finished successfully
"failed": Execution failed or was aborted

Note: When filtering by status in list endpoints, "processing" in v2 includes both "pending" and "running" states.
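
The status semantics above can be captured in two small helpers, shown here as a sketch with hypothetical names. They mirror the note that a "processing" filter covers both "pending" and "running".

```python
TERMINAL_STATUSES = {"completed", "failed"}
PROCESSING_STATUSES = {"pending", "running"}


def is_terminal(status: str) -> bool:
    """True when an execution will not change state any further."""
    return status in TERMINAL_STATUSES


def matches_filter(status: str, status_filter: str) -> bool:
    """Mirror the list-endpoint filter: "processing" covers two v2 states."""
    if status_filter == "processing":
        return status in PROCESSING_STATUSES
    return status == status_filter
```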

Support

If you encounter issues during migration or have questions about specific changes, please contact support at founders@getcatalog.ai or refer to the detailed endpoint documentation for v2.
