feat(catalog): Catalog capability for product discovery#55

Open
igrigorik wants to merge 19 commits into main from feat/catalog-capability

Conversation

@igrigorik
Contributor

@igrigorik igrigorik commented Jan 15, 2026

Summary

Introduces the dev.ucp.shopping.catalog capability, enabling platforms to search merchant product catalogs and retrieve product/variant details. This fills the "discovery gap" before checkout—while checkout assumes you already have a variant ID, catalog enables scenarios like "find me blue running shoes under $150" that lead to cart building and purchase.

The capability provides two operations: free-text search with filters and pagination, and direct product/variant lookup by ID. Both return a consistent Product structure containing variants, media, options, and pricing that seamlessly connects to checkout's line_items[].item.id.

Motivation

UCP's checkout capability assumes the platform already knows what to buy. But agents and platforms need to discover products first—browsing catalogs, comparing options, filtering by price or category. Even when a product ID is already known (e.g., from a product feed integration), platforms need live lookups to fetch current facts—price, availability, and other details that may have changed. Without a standardized catalog interface, every integration requires custom product APIs or out-of-band discovery mechanisms—N custom integrations instead of one standardized protocol.

Goals

  • Enable product discovery via free-text search with category and price filters
  • Support semantic search via optional intent context field
  • Provide real-time product/variant lookup for cart validation and deep links
  • Return variant IDs directly usable in checkout line_items[].item.id
  • Establish consistent Product/Variant schema for cross-merchant compatibility

Non-Goals

  • Inventory management or stock updates (read-only catalog access)
  • Product creation, modification, or deletion (merchant-side operations)
  • Recommendations or personalization algorithms (merchant implementation detail)
  • Full-text search ranking or relevance tuning (merchant implementation detail)

Detailed Design

Operations

Operation          Description
search_catalog     Free-text search with filters, context, and pagination
get_catalog_item   Retrieve product or variant by Global ID

Data Structures

Product (catalog entry):

├─ id, title, description, url, category
├─ price: PriceRange (min/max across variants)
├─ media[]: images, videos, 3D models (first = featured)
├─ options[]: dimensions like Size, Color
├─ variants[]: purchasable SKUs (first = featured)
├─ rating: aggregate reviews
└─ metadata: merchant-defined extensions

Variant (purchasable SKU):

├─ id: used as item.id in checkout
├─ sku, barcode: inventory identifiers
├─ title, price, availability
├─ selected_options[]: option values for this variant
├─ media[], rating, tags, metadata
└─ seller: optional marketplace context

Key Behaviors

Ordering Convention: media[] and variants[] are ordered arrays. Merchants SHOULD return featured items first; platforms SHOULD treat first element as featured.
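A minimal sketch of how a platform might apply the ordering convention, assuming products arrive as plain dictionaries shaped like the Product structure above. The helper name `featured` and the sample IDs and URLs are hypothetical, not part of the proposal.

```python
from typing import Any, Optional


def featured(items: list[dict[str, Any]]) -> Optional[dict[str, Any]]:
    """Return the first element of an ordered array, or None when it is empty."""
    return items[0] if items else None


# Hypothetical product payload; only the array ordering matters here.
product = {
    "id": "prod_1",
    "media": [{"url": "https://example.com/a.jpg"}, {"url": "https://example.com/b.jpg"}],
    "variants": [{"id": "var_blue_l"}, {"id": "var_red_m"}],
}

featured_variant = featured(product["variants"])
featured_media = featured(product["media"])
```

Returning `None` for an empty array keeps the caller responsible for deciding what to show when a merchant sends no media or variants.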

ID Resolution: get_catalog_item accepts product OR variant IDs:

  • Product ID → returns product with representative variant set
  • Variant ID → returns parent product with only requested variant

Error Handling: NOT_FOUND returns HTTP 200 with messages[].type: "error" (not 404). All catalog errors use severity: "recoverable" for programmatic handling.
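Because a missing item comes back as HTTP 200, a client cannot rely on the status code alone. The sketch below shows one way to inspect `messages[]`. The `code` field name and the sample body are assumptions for illustration; only `type: "error"`, `severity: "recoverable"`, and the NOT_FOUND code come from the description above. The comparison is case-insensitive as a defensive choice.

```python
from typing import Any


def error_messages(body: dict[str, Any]) -> list[dict[str, Any]]:
    """Collect every message whose type is "error" from a catalog response body."""
    return [m for m in body.get("messages", []) if m.get("type") == "error"]


def is_not_found(body: dict[str, Any]) -> bool:
    """True when any error message carries a not-found code (assumed field: "code")."""
    return any(str(m.get("code", "")).lower() == "not_found" for m in error_messages(body))


# Hypothetical HTTP 200 body for an unknown ID.
response_body = {
    "messages": [{"type": "error", "code": "NOT_FOUND", "severity": "recoverable"}]
}
```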

Transport Bindings

REST:

POST /catalog/search     → search_catalog
GET  /catalog/item/{id}  → get_catalog_item

MCP (JSON-RPC):

search_catalog
get_catalog_item

Risks and Mitigations

  • Complexity: Adding another capability increases protocol surface area. Mitigation: Catalog is cleanly scoped with only 2 operations and reuses existing patterns (context, messages, pagination).
  • Schema Drift: Product schemas vary wildly across merchants. Mitigation: Core fields are minimal and universal; metadata and additionalProperties enable merchant extensions without protocol changes.
  • Search Quality: Free-text search results depend entirely on merchant implementation. Mitigation: Protocol defines interface only; merchants own ranking/relevance. intent field enables semantic hints without prescribing implementation.
  • Backward Compatibility: N/A—new capability, no breaking changes to existing checkout flow.
  • Performance: Large catalogs could return excessive data. Mitigation: Pagination with default 10, max 25 results; representative variant sets for products with many variants.

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • New and existing unit tests pass locally with my changes

@igrigorik igrigorik requested a review from sinhanurag January 15, 2026 17:06
@igrigorik igrigorik self-assigned this Jan 15, 2026
@igrigorik igrigorik requested a review from a team January 15, 2026 17:06
@igrigorik igrigorik added the TC review Ready for TC review label Jan 15, 2026
@igrigorik igrigorik added this to the Working Draft milestone Jan 16, 2026
@amithanda
Contributor

Thanks Ilya. I am assuming we will add local store support in a later PR?
I will do a pass by Tuesday.

@knightlin-shopify knightlin-shopify left a comment

What is the expected language handling for text content coming from the business (e.g., product/variant descriptions and message content)? We should require an explicit content language provided by the platform/agent (via context) that reflects the user’s selected display language, which cannot be inferred from context.country. The agent should render the content as-is and avoid automatic translation, since some fields (e.g., product name, variant name, legal/safety/liability disclaimers) must not be translated and may appear mixed within the same text blob.

Contributor

@raginpirate raginpirate left a comment

Again... incredible to see these primitives come to life so "simply"!
Just a couple questions about unification I spotted.

@TioBorracho

TioBorracho commented Jan 20, 2026

Have you considered adding either a specific alternate ID filter or an alternate ID search instead of free text? I am thinking of cases where you already have the SKU or GTIN, but not the variant ID (or generic global ID). Searching by free text can return more results if the string was modified from the original title or description, whereas searching by alternate IDs would be more accurate.

@nicolasgarnier

nicolasgarnier commented Jan 22, 2026

Hi team,

I am currently exploring the Catalog implementation specifically for local shopping use cases. I’d appreciate some clarification on how the current PR intends to handle multi-location merchants and local product availability:

  1. Location-Aware Search: How does the protocol envision handling searches for items at specific physical locations (e.g., for multi-location merchants)?
    Is the expectation that location context should be passed via free-text queries, or is there an intended way to pass geo-coordinates/location IDs so the merchant can filter availability on their end?

  2. Product-Level Extensions for Local Availability: I see that a metadata attribute exists for extensions, and the current Fulfillment extension supports a pickup option. However, Fulfillment seems currently optimized for the order/checkout phase.
    Is there an existing or planned definition for Product-level extensions to communicate local availability (e.g., "In stock at Store A," "Out of stock at Store B")?
    Specifically, can the metadata block be used to expose a list of store locations where a product is physically available for immediate pickup?

The goal is to enable AI agents to reliably find items available "near the user." If the schema doesn't explicitly support local inventory flags at the Catalog level, agents might struggle to differentiate between shipping-only items and those available for same-day local pickup. Is this distinction currently within the scope of this PR?

Introduces the `dev.ucp.shopping.catalog` capability, enabling platforms to search
a business's product catalog and perform targeted product+variant lookups.

Checkout capability assumes the platform already knows what item to buy (via
variant ID). Catalog capability fills the discovery gap—enabling scenarios like
"find me blue running shoes under $150" that lead to cart building and checkout.

  Product (catalog entry)
    ├─ id, title, description, url, category
    ├─ price: PriceRange (min/max across variants)
    ├─ media[]: images, videos, 3D models (first = featured)
    ├─ options[]: dimensions like Size, Color
    ├─ variants[]: purchasable SKUs (first = featured)
    ├─ rating: aggregate reviews
    └─ metadata: merchant-defined data

  Variant (purchasable SKU)
    ├─ id: used as item.id in checkout
    ├─ sku, barcode: inventory identifiers
    ├─ title: "Blue / Large"
    ├─ price: Price (amount + currency in minor units)
    ├─ availability: { available: bool }
    ├─ selected_options[]: option values for this variant
    ├─ media[], rating, tags, metadata
    └─ seller: optional marketplace context

  search_catalog:
  - Free-text query with semantic search support
  - Filters: category (string), price (min/max in minor units)
  - Context: country, region, postal_code, intent
  - Cursor-based pagination (default 10, max 25)
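The search inputs listed above could be assembled into a request body along these lines. This is a sketch under assumptions: the key names `query`, `pagination`, `limit`, and `cursor` are hypothetical, while the filter fields, the context fields, and the default 10 / max 25 page size come from the list above.

```python
from typing import Any, Optional

DEFAULT_LIMIT = 10  # default page size stated in the PR
MAX_LIMIT = 25      # maximum page size stated in the PR


def build_search_request(
    query: str,
    *,
    category: Optional[str] = None,
    price_min: Optional[int] = None,  # minor units, e.g. cents
    price_max: Optional[int] = None,  # minor units, e.g. cents
    context: Optional[dict[str, Any]] = None,
    cursor: Optional[str] = None,
    limit: int = DEFAULT_LIMIT,
) -> dict[str, Any]:
    """Build a search body, omitting unset fields and clamping the page size."""
    filters: dict[str, Any] = {}
    if category is not None:
        filters["category"] = category
    price = {k: v for k, v in (("min", price_min), ("max", price_max)) if v is not None}
    if price:
        filters["price"] = price

    body: dict[str, Any] = {
        "query": query,
        "pagination": {"limit": max(1, min(limit, MAX_LIMIT))},
    }
    if cursor:
        body["pagination"]["cursor"] = cursor
    if filters:
        body["filters"] = filters
    if context:
        body["context"] = context
    return body


# "blue running shoes under $150": 15000 minor units, assuming a currency with cents.
request = build_search_request(
    "blue running shoes",
    price_max=15000,
    context={"country": "US", "region": "CA"},
    limit=100,  # deliberately too large; clamped to MAX_LIMIT
)
```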

  get_catalog_item:
  - Accepts product ID OR variant ID
  - Always returns parent product with context
  - Product ID → variants MAY be representative set
  - Variant ID → variants contains only requested variant
  - NOT_FOUND returns HTTP 200 with error message (not 404)

Location and market context unified into reusable types/context.json:

  {
    "country": "US",      // ISO 3166-1 alpha-2
    "region": "CA",       // State/province
    "postal_code": "..."  // ZIP/postal
  }

Catalog extends with 'intent' for semantic search hints.

REST:
  POST /catalog/search     → search_catalog
  GET  /catalog/item/{id}  → get_catalog_item

MCP (JSON-RPC):
  search_catalog
  get_catalog_item

  Inline object definitions in search_request.filters weren't rendered
  in generated docs (showed as plain "object" without properties).

  Fix by extracting to referenceable schemas:
  - search_filters.json: category + price filter definitions
  - price_filter.json: min/max integer bounds (distinct from price_range
    which uses full Price objects with currency)

  Split catalog into two capabilities that can be adopted independently:
  - dev.ucp.shopping.catalog.search
  - dev.ucp.shopping.catalog.lookup

  Docs:
  - Restructure to catalog/ directory
  - index.md: shared concepts (Product, Variant, Price, Messages)
  - search.md, lookup.md: individual capability docs
  - rest.md, mcp.md: transport bindings

  Add `language` to shared Context type for requesting localized content.
  Uses IETF BCP 47 language tags (same format as HTTP Accept-Language).

  Key design points:
  - For REST, platforms SHOULD fall back to Accept-Language header when
    field is absent; when provided, context.language overrides header
  - Same provisional hint pattern: businesses MAY return different
    language if requested language unavailable
  - Shared across catalog, checkout, cart (applies to buyer journey)
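The precedence rule above (an explicit `context.language` wins, otherwise fall back to `Accept-Language`) might be resolved as in the sketch below. The function names are hypothetical, and the header parsing is deliberately simplified: it picks the tag with the highest q-value and ignores wildcards.

```python
from typing import Any, Optional


def _preferred_from_header(header: str) -> Optional[str]:
    """Return the language tag with the highest q-value from an Accept-Language header."""
    best_tag: Optional[str] = None
    best_q = -1.0
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        tag = pieces[0]
        if not tag or tag == "*":
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in pieces[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        if q > best_q:
            best_tag, best_q = tag, q
    return best_tag


def resolve_language(context: dict[str, Any], headers: dict[str, str]) -> Optional[str]:
    """context.language overrides the header; the header is only a fallback."""
    if context.get("language"):
        return context["language"]
    normalized = {k.lower(): v for k, v in headers.items()}
    header = normalized.get("accept-language")
    return _preferred_from_header(header) if header else None
```

Since the language is described as a provisional hint, the response may still come back in a different language than the one resolved here.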

  Support both endpoints for consistency with cart/checkout:
  - GET /catalog/item/{id}?country=US&region=CA&language=es - location context
    via query params, language overrides Accept-Language header
  - POST /catalog/lookup - full context in body (including sensitive `intent`)

  Context is extensible, but only a well-known subset (country, region,
  postal_code, language) is supported via GET query params. POST supports
  the complete context object.

  Also normalizes error codes to lowercase snake_case for consistency with
  checkout (NOT_FOUND → not_found, DELAYED_FULFILLMENT → delayed_fulfillment).

Context security (catalog, checkout, schema):
- Clarify context signals are provisional hints—not authorization
- Enforcement MUST occur at checkout with authoritative data
- May be ignored if inconsistent with stronger signals (fraud rules,
  export controls, authenticated account)

Currency support:
- Add optional `currency` field to context for multi-currency markets
- Supports scenarios like Switzerland (CHF default, EUR preference)
- Added to REST GET query params and schema

Security:
- Add HTML sanitization guidance to product/variant descriptions
- Platforms MUST strip scripts, event handlers, untrusted elements
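To make the sanitization guidance concrete, here is a small allowlist-based sketch using only Python's standard `html.parser`. The allowed tag set is an assumption, not something the PR specifies, and a production platform would more likely rely on a vetted sanitizer library; this only illustrates the intent of stripping scripts, event handlers, and untrusted elements.

```python
from html import escape
from html.parser import HTMLParser

ALLOWED_TAGS = {"p", "br", "b", "i", "em", "strong", "ul", "ol", "li"}  # assumed allowlist
DROP_CONTENT_TAGS = {"script", "style"}  # drop these tags and everything inside them


class _Sanitizer(HTMLParser):
    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.out: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT_TAGS:
            self._skip_depth += 1
        elif self._skip_depth == 0 and tag in ALLOWED_TAGS:
            # Every attribute is dropped, which removes on* handlers and style.
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT_TAGS:
            if self._skip_depth:
                self._skip_depth -= 1
        elif self._skip_depth == 0 and tag in ALLOWED_TAGS and tag != "br":
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.out.append(escape(data))  # re-escape text so it cannot form new tags


def sanitize_description(html_text: str) -> str:
    """Return the description with only allowlisted, attribute-free tags kept."""
    parser = _Sanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


clean = sanitize_description('<p onclick="x()">Hi <b>there</b></p><script>alert(1)</script>')
```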

Schema refinements:
- rating.json: add `scale_min` with default: 1
- price_filter.json: clarify context→currency determination
- product.json: improve `handle` description for SEO vs API usage

Consistency:
- Error codes normalized to lowercase snake_case (not_found)
- Fix invalid product example (missing description, price)
- Update media/variants ordering language
@igrigorik igrigorik force-pushed the feat/catalog-capability branch from 8f27dea to f92b9db Compare January 30, 2026 14:59
@igrigorik
Contributor Author

@knightlin-shopify @maximenajim ty for thorough reviews!

Have you considered adding either a specific alternate ID filter or an alternate ID search instead of free text? I am thinking of cases where you already have the SKU or GTIN, but not the variant ID (or generic global ID). Searching by free text can return more results if the string was modified from the original title or description, whereas searching by alternate IDs would be more accurate.

@TioBorracho I think the best way to approach this is via the existing "by ID" lookup, where the ID can match multiple keys: product, variant, SKU, GTIN, etc.

@igrigorik
Contributor Author

I am currently exploring the Catalog implementation specifically for local shopping use cases. I’d appreciate some clarification on how the current PR intends to handle multi-location merchants and local product availability:

  1. Location-Aware Search: How does the protocol envision handling searches for items at specific physical locations (e.g., for multi-location merchants)?
    Is the expectation that location context should be passed via free-text queries, or is there an intended way to pass geo-coordinates/location IDs so the merchant can filter availability on their end?
  2. Product-Level Extensions for Local Availability: I see that a metadata attribute exists for extensions, and the current Fulfillment extension supports a pickup option. However, Fulfillment seems currently optimized for the order/checkout phase.
    Is there an existing or planned definition for Product-level extensions to communicate local availability (e.g., "In stock at Store A," "Out of stock at Store B")?
    Specifically, can the metadata block be used to expose a list of store locations where a product is physically available for immediate pickup?

The goal is to enable AI agents to reliably find items available "near the user." If the schema doesn't explicitly support local inventory flags at the Catalog level, agents might struggle to differentiate between shipping-only items and those available for same-day local pickup. Is this distinction currently within the scope of this PR?

@nicolasgarnier this is a perfect use case for an extension that builds on the core schema we define here. As a napkin sketch..

  • Capability: dev.ucp.shopping.catalog.local_inventory
  • Extends: dev.ucp.shopping.catalog.search, dev.ucp.shopping.catalog.lookup

I would then augment the availability field with stores or something similar, e.g.

  {
    "availability": {
      "available": true,
      "stores": [
        {
          "id": "store_sf_downtown",
          "name": "San Francisco - Downtown",
          "address": { ... },
          "pickup_available": true,
          ...
        },
        {
          "id": "store_palo_alto",
          "name": "Palo Alto",
          "address": { ... },
          "pickup_available": false,
          ...
        }
      ]
    }
  }

For targeting, we already provide a context field, which allows country > region > postal code resolution. If that's not enough, the extension can enhance it as well, e.g. by adding lat/long and a search radius.
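If an extension along the lines of the napkin sketch existed, an agent could separate pickup-capable stores from the rest roughly as follows. This assumes the hypothetical `availability.stores[]` shape from the sketch above; neither the field names nor the helper `pickup_stores` are part of the current schema.

```python
from typing import Any


def pickup_stores(variant: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the stores that report pickup_available for this variant."""
    availability = variant.get("availability", {})
    return [s for s in availability.get("stores", []) if s.get("pickup_available")]


# Hypothetical variant using the sketched local-inventory shape.
variant = {
    "id": "var_blue_l",
    "availability": {
        "available": True,
        "stores": [
            {"id": "store_sf_downtown", "name": "San Francisco - Downtown", "pickup_available": True},
            {"id": "store_palo_alto", "name": "Palo Alto", "pickup_available": False},
        ],
    },
}

nearby = pickup_stores(variant)
```

A variant without the extension simply yields an empty list, so the same code path works against merchants that only implement the core schema.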

@igrigorik
Contributor Author

Addressed all the feedback, I believe this should be good to land as working draft. Key updates from original..

Split into two independent capabilities that implementers can adopt and advertise independently:

  • dev.ucp.shopping.catalog.search — free-text search with filters and pagination
  • dev.ucp.shopping.catalog.lookup — lookup retrieval by product/variant ID

A few context enhancements...

  • language — preferred presentation language: BCP 47 tags (e.g., 'es', 'fr-CA')
  • currency — preferred presentment currency: ISO 4217 codes (e.g., 'EUR', 'USD')

Support for GET+POST lookup operations

  • GET /catalog/item/{id}?country=US&language=en — supports well-known subset of context params
  • POST /catalog/lookup — full context body

GET matches our pattern for GET cart/checkout and adds support for well-known set of context query params. Because context is extensible we can't statically predefine the full list, and that may not be desirable to begin with because some context data might contain sensitive information that should not be communicated via query params. POST allows you to pass in the id and full context object. I think we could get away with POST only, but GET has its perks in terms of DX, cacheability, etc -- open to being swayed on dual GET+POST.

@TioBorracho

I left a comment on the file, but I think the REST path for GET should be items, not item.

  Product.price → Product.price_range
  Product.list_price → Product.list_price_range

  The fields reference PriceRange schema (min/max), so naming should
  reflect this. Variant.price and Variant.list_price remain unchanged
  as they reference single Price (not range).
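The relationship between the two levels can be sketched as deriving a product-level `price_range` from variant-level `price` values. The field names `amount` and `currency` follow the Price description earlier in the thread (amount in minor units); the helper name and the single-currency check are assumptions.

```python
from typing import Any


def price_range_of(variants: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Compute a PriceRange-like {min, max} of full Price objects across variants."""
    prices = [v["price"] for v in variants]
    if not prices:
        raise ValueError("cannot compute a price range without variants")
    if len({p["currency"] for p in prices}) != 1:
        raise ValueError("variants use mixed currencies")
    return {
        "min": min(prices, key=lambda p: p["amount"]),
        "max": max(prices, key=lambda p: p["amount"]),
    }


# Hypothetical variants; amounts are in minor units (cents).
variants = [
    {"id": "v1", "price": {"amount": 12000, "currency": "USD"}},
    {"id": "v2", "price": {"amount": 9900, "currency": "USD"}},
]
price_range = price_range_of(variants)
```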

  Introduce category.json schema to support multiple product
  taxonomies (google_product_category, shopify, merchant).

  Schema changes:
  - New: category.json with {value, taxonomy} structure
  - Product: category string → category[] (Category[])
  - Variant: category string → category[] (Category[])
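With `category` now an array of `{value, taxonomy}` entries, a platform would typically pick the entry for the taxonomy it understands. A minimal sketch, with a hypothetical helper name and made-up category values:

```python
from typing import Any, Optional


def category_for(categories: list[dict[str, Any]], taxonomy: str) -> Optional[str]:
    """Return the category value for the given taxonomy, or None if it is absent."""
    for entry in categories:
        if entry.get("taxonomy") == taxonomy:
            return entry.get("value")
    return None


# Hypothetical category[] array; taxonomy names come from the commit message above.
categories = [
    {"value": "Apparel & Accessories > Shoes", "taxonomy": "google_product_category"},
    {"value": "running-shoes", "taxonomy": "merchant"},
]
google_category = category_for(categories, "google_product_category")
```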

  Allow batch (multi ID) lookup support. Enables single and multi
  item lookup and aligns request and response shapes to search.

  Discussed at TC, agreed on POST-only:
  - Keeps request/response modeling symmetric (context in body)
  - Avoids special-casing query params for extensible context fields
  - Simplifies overall protocol surface
  - Single item GET can be added later if necessary

  Updated API:
  - REST: `POST /catalog/lookup` accepts `ids` array + `context` object
  - MCP: `catalog.ids` inside catalog object (matches search pattern)
  - Response returns `products` array (symmetric with search)

  Identifier flexibility:
  - MUST support product ID and variant ID
  - MAY support secondary identifiers (SKU, handle, etc.)
  - Secondary identifiers must be fields on returned product object
  - Client correlates results by matching fields (no guaranteed order)
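Since the response carries no guaranteed order, the client has to match each requested identifier back to a product itself. The sketch below assumes products are plain dictionaries and that the secondary identifiers in play are `sku` (on variants) and `handle` (on products); a real client would check whichever fields the business actually supports.

```python
from typing import Any, Optional


def correlate(
    requested_ids: list[str], products: list[dict[str, Any]]
) -> dict[str, Optional[dict[str, Any]]]:
    """Map each requested ID to the first returned product that exposes it."""
    matches: dict[str, Optional[dict[str, Any]]] = {rid: None for rid in requested_ids}
    for product in products:
        # Every identifier this product can be found by: product ID, handle,
        # and the ID and SKU of each variant.
        keys = {product.get("id"), product.get("handle")}
        for variant in product.get("variants", []):
            keys.add(variant.get("id"))
            keys.add(variant.get("sku"))
        for rid in requested_ids:
            if rid in keys and matches[rid] is None:
                matches[rid] = product
    return matches


# Hypothetical POST /catalog/lookup body and response fragment.
lookup_request = {"ids": ["SKU-1", "prod_1", "missing"], "context": {"country": "US"}}
products = [
    {"id": "prod_1", "handle": "blue-runner", "variants": [{"id": "var_1", "sku": "SKU-1"}]}
]
matches = correlate(lookup_request["ids"], products)
```

IDs that map to `None` are the ones the client should treat as not found.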

@alex-jansen alex-jansen left a comment

The definitions under spec/schemas/shopping/types are (almost) the same as those under source/schemas/shopping/types. I assume (unless I missed something) that the latter is the source of truth and the former should be deleted before merging this PR?

@igrigorik
Contributor Author

@alex-jansen good catch, spec/ is no longer necessary. We serve and build everything from source/ now. Removed.

Labels

TC review Ready for TC review
