A schema is a JSON description of how to talk to a website. It lists every endpoint and action the site exposes — URLs, HTTP methods, required headers, parameters, and response shapes. Once a schema exists for a site, any agent can look it up and call the endpoints directly, without re-discovering them.

What’s in a schema

A schema for a Shopify store might include:
| Endpoint | Method | URL | What it does |
| --- | --- | --- | --- |
| shopify_products | GET | /products.json | List the product catalog |
| shopify_product | GET | /products/{handle}.json | Get a single product |
| add_to_cart | POST | /cart/add.js | Add a variant to the cart |
| get_cart | GET | /cart.js | Get current cart contents |

Each endpoint carries its parameters, required headers (like Content-Type: application/json), and enough metadata for an agent to construct a valid HTTP request.

Schema format

Schemas are JSON documents. Here’s a minimal example:
{
  "site": "allbirds.com",
  "intent_category": "commerce",
  "schema_format_version": "0.1",
  "name": "allbirds-products",
  "description": "Shopify-based product catalog and cart API for allbirds.com",
  "endpoints": [
    {
      "name": "shopify_products",
      "method": "GET",
      "url_template": "https://www.allbirds.com/products.json",
      "description": "Public product catalog from the Shopify AJAX API",
      "headers": {},
      "response_schema": {
        "type": "object",
        "fields": [
          { "name": "products", "type": "array" }
        ]
      }
    }
  ],
  "actions": [
    {
      "name": "add_to_cart",
      "method": "POST",
      "url_template": "https://www.allbirds.com/cart/add.js",
      "description": "Add a product variant to the shopping cart",
      "kind": "api_call",
      "transport": "api_call",
      "headers": { "Content-Type": "application/json" },
      "params": [
        { "name": "id", "in": "body", "type": "integer", "required": true, "description": "Product variant ID" },
        { "name": "quantity", "in": "body", "type": "integer", "required": true, "description": "Quantity to add" }
      ],
      "confidence": 0.98,
      "source": "shopify_ajax_api"
    }
  ]
}
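
As a rough illustration of how an agent might turn one entry into an HTTP request, here is a minimal Python sketch. The `build_request` helper is hypothetical (it is not part of the Hermai CLI or API). It assumes that `in` is either `body` or `query`, that body params are sent as one JSON object, and that `{name}` placeholders in `url_template` are filled from caller-supplied values. It only constructs the request and never sends it.

```python
import json
import urllib.parse
import urllib.request


def build_request(entry: dict, values: dict) -> urllib.request.Request:
    """Hypothetical helper: build (but do not send) a request for one schema entry."""
    # Fill any {placeholder} in the URL template, e.g. {handle}.
    url = entry["url_template"]
    for name, value in values.items():
        url = url.replace("{" + name + "}", urllib.parse.quote(str(value), safe=""))

    body, query = {}, {}
    for param in entry.get("params", []):
        name = param["name"]
        if name not in values:
            if param.get("required"):
                raise ValueError(f"missing required param: {name}")
            continue
        # Assumption: "in" is either "body" or "query".
        target = body if param.get("in") == "body" else query
        target[name] = values[name]

    if query:
        url += "?" + urllib.parse.urlencode(query)
    data = json.dumps(body).encode("utf-8") if body else None
    return urllib.request.Request(
        url,
        data=data,
        headers=dict(entry.get("headers", {})),
        method=entry["method"],
    )


# The add_to_cart action from the example above, trimmed to the fields used here.
add_to_cart = {
    "method": "POST",
    "url_template": "https://www.allbirds.com/cart/add.js",
    "headers": {"Content-Type": "application/json"},
    "params": [
        {"name": "id", "in": "body", "type": "integer", "required": True},
        {"name": "quantity", "in": "body", "type": "integer", "required": True},
    ],
}

req = build_request(add_to_cart, {"id": 41397031600208, "quantity": 1})
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the resulting object to `urllib.request.urlopen` would perform the call; the sketch stops short of that so it can be run without touching the site.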

Required fields

| Field | Type | Description |
| --- | --- | --- |
| site | string | The domain this schema covers (e.g. allbirds.com). Normalized to lowercase. |
| intent_category | string | Must match a category from the intent taxonomy (e.g. commerce, travel, jobs). |
| schema_format_version | string | Must be "0.1" (current version). Defaults to "0.1" if omitted. |

Optional fields

| Field | Type | Description |
| --- | --- | --- |
| name | string | Human-readable name for the schema |
| description | string | What this schema covers |
| version | number | Schema version number (informational) |
| endpoints | array | Read-shaped API endpoints (GET requests that return data) |
| actions | array | Write-shaped or interactive actions (POST requests, cart operations, etc.) |
| requires_stealth | boolean | Whether the site requires TLS fingerprinting to avoid bot detection |
| session | object | Session bootstrap configuration (cookies, headers needed before requests work) |

Capability badges

The registry surfaces capability signals as badges so agents know what a request requires:
| Badge | Source field | What it means |
| --- | --- | --- |
| Requires browser | requires_stealth: true | The site needs Chrome-like TLS fingerprinting. Agents should use a stealth HTTP client or the CLI’s --stealth flag. |
| Requires session | session object present | The site needs cookies or headers bootstrapped before API calls work (e.g. clearance cookies for Cloudflare-protected sites). |
| Has auth | headers containing Authorization or X-API-* keys | The endpoint requires authentication headers. |

These badges appear on the public schema card so agents can decide whether they can handle the site before downloading the full schema.
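
The mapping from schema fields to badges can be approximated locally. The following Python sketch is only an illustration of the table above: `capability_badges` is a hypothetical function, and it assumes header names are matched case-insensitively and that both `endpoints` and `actions` are scanned. The registry's actual logic may differ.

```python
def _is_auth_header(key: str) -> bool:
    # Assumption: matching is case-insensitive.
    lowered = key.lower()
    return lowered == "authorization" or lowered.startswith("x-api-")


def capability_badges(schema: dict) -> list[str]:
    """Hypothetical helper: derive badge labels from a schema document."""
    badges = []
    if schema.get("requires_stealth") is True:
        badges.append("Requires browser")
    if isinstance(schema.get("session"), dict):
        badges.append("Requires session")
    # Assumption: headers on both endpoints and actions count toward "Has auth".
    entries = schema.get("endpoints", []) + schema.get("actions", [])
    if any(_is_auth_header(k) for e in entries for k in e.get("headers", {})):
        badges.append("Has auth")
    return badges


example = {
    "site": "example.com",
    "requires_stealth": True,
    "session": {},
    "actions": [{"name": "search", "headers": {"X-API-Key": "..."}}],
}
print(capability_badges(example))
```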

Contributing a schema

Anyone with an API key can push a schema. There are three paths:

Option 1: Discover with the CLI toolkit

# Install the CLI
go install github.com/hermai-ai/hermai-cli/cmd/hermai@latest

# Classify the site and probe standard paths
hermai detect https://example.com
hermai wellknown example.com

# Extract embedded data from a detail page — recognises 13 patterns
# including __NEXT_DATA__, JSON-LD, ytInitialData, __APOLLO_STATE__, etc.
hermai probe --body https://example.com/products/123 | hermai extract

# For dynamic pages (search, cart, filters) capture real XHR traffic
# (quote the URL so the shell does not treat ? as a glob character)
hermai intercept "https://example.com/search?q=test"

# Introspect GraphQL endpoints if you find one
hermai introspect https://example.com/graphql

# Write schema.json with the endpoints you confirmed, then push
hermai registry push schema.json
Each subcommand emits JSON that the next step can consume — no LLM key needed.

Option 2: Let your agent do it

npx skills add hermai-ai/hermai-skills
Claude Code, Codex, Cursor, and other agents with the Vercel skills CLI pick up the hermai-contribute skill and can run the discovery + push workflow autonomously. See hermai-ai/hermai-skills.

Option 3: Write and push manually

If you already know the site’s API (from DevTools, documentation, or testing), you can write the schema JSON yourself and push it via the API:
curl -X POST "https://api.hermai.ai/v1/schemas" \
     -H "Authorization: Bearer hm_sk_..." \
     -H "Content-Type: application/json" \
     -d '{
  "site": "allbirds.com",
  "intent_category": "commerce",
  "schema_format_version": "0.1",
  "name": "allbirds-shopify",
  "description": "Shopify AJAX API for allbirds.com — product catalog and cart",
  "endpoints": [
    {
      "name": "shopify_products",
      "method": "GET",
      "url_template": "https://www.allbirds.com/products.json",
      "description": "Public product catalog",
      "headers": {},
      "response_schema": {
        "type": "object",
        "fields": [
          { "name": "products", "type": "array" }
        ]
      }
    },
    {
      "name": "shopify_product",
      "method": "GET",
      "url_template": "https://www.allbirds.com/products/{handle}.json",
      "description": "Single product detail by handle",
      "variables": [
        { "name": "handle", "source": "path" }
      ]
    }
  ],
  "actions": [
    {
      "name": "add_to_cart",
      "method": "POST",
      "url_template": "https://www.allbirds.com/cart/add.js",
      "description": "Add a product variant to the shopping cart",
      "kind": "api_call",
      "transport": "api_call",
      "headers": { "Content-Type": "application/json" },
      "params": [
        { "name": "id", "in": "body", "type": "integer", "required": true, "description": "Product variant ID" },
        { "name": "quantity", "in": "body", "type": "integer", "required": true, "default": "1", "description": "Quantity to add" }
      ],
      "confidence": 0.98,
      "source": "manual"
    },
    {
      "name": "get_cart",
      "method": "GET",
      "url_template": "https://www.allbirds.com/cart.js",
      "description": "Get current cart contents",
      "kind": "api_call",
      "transport": "api_call",
      "headers": {},
      "confidence": 0.98,
      "source": "manual"
    },
    {
      "name": "update_cart",
      "method": "POST",
      "url_template": "https://www.allbirds.com/cart/update.js",
      "description": "Update item quantities by variant ID. Pass {variant_id: new_qty} for each line. Setting qty=0 removes the item.",
      "kind": "api_call",
      "transport": "api_call",
      "headers": { "Content-Type": "application/json" },
      "params": [
        { "name": "updates", "in": "body", "type": "object", "required": true, "description": "Map of variant ID to quantity, e.g. {\"41397031600208\": 2}" }
      ],
      "confidence": 0.98,
      "source": "manual"
    }
  ]
}'
A successful push returns the version hash and site:
{
  "success": true,
  "data": {
    "version_hash": "9a0a27c4085fc661...",
    "site": "allbirds.com",
    "created": true
  }
}
If validation fails, you get a specific error code:
{
  "success": false,
  "error": {
    "code": "UNKNOWN_CATEGORY",
    "message": "intent_category does not exist in current taxonomy"
  }
}
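
A client can branch on the `success` flag of these two response shapes. The Python sketch below is a minimal illustration using only the fields shown above. `summarize_push_response` is a hypothetical name, and the reading of `created: false` as "content already present" is an assumption based on the duplicate-push behavior, not something the API is confirmed to guarantee.

```python
import json


def summarize_push_response(raw: str) -> str:
    """Hypothetical helper: turn a push response body into a one-line summary."""
    reply = json.loads(raw)
    if reply.get("success"):
        data = reply["data"]
        # Assumption: created=false means the registry already had this content.
        state = "new version" if data.get("created") else "already present"
        return f"{data['site']}: {state} ({data['version_hash']})"
    error = reply["error"]
    return f"rejected [{error['code']}]: {error['message']}"


ok_body = '{"success": true, "data": {"version_hash": "9a0a27c4085fc661...", "site": "allbirds.com", "created": true}}'
err_body = '{"success": false, "error": {"code": "UNKNOWN_CATEGORY", "message": "intent_category does not exist in current taxonomy"}}'

print(summarize_push_response(ok_body))
print(summarize_push_response(err_body))
```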
Start minimal. You don’t need to map every endpoint — even a single useful endpoint is a valid contribution. Other contributors can push updated versions with more coverage later.
Schemas go live immediately — no approval queue.

Validation rules

Every push runs through validation. If any check fails, the push is rejected with a specific error code.

Structure checks:

| Rule | Error code | Description |
| --- | --- | --- |
| Valid JSON | INVALID_JSON | The body must parse as JSON |
| site present | MISSING_FIELD | The site field is required |
| intent_category present | MISSING_FIELD | The intent_category field is required |
| Category exists | UNKNOWN_CATEGORY | The intent_category must match a slug in the taxonomy. Call GET /v1/categories to see valid values. |
| Format version | UNSUPPORTED_FORMAT | schema_format_version must be "0.1" |

Content governance:

| Rule | Error code | Description |
| --- | --- | --- |
| Forbidden names | FORBIDDEN_NAME | Schema names must not describe circumvention techniques. Rejected patterns: bypass, circumvent, cf-clearance, anti-bot, rotate-ip. Name the schema after what it does (e.g. shopify-products), not how it beats a defense. |
| Forbidden fields | FORBIDDEN_FIELD | Schemas must not contain credential or operational secret fields at any nesting depth. See forbidden fields list below. |
| Exclusion list | SITE_EXCLUDED | Some sites have requested removal from the registry. Pushing a schema for an excluded site is rejected. |
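
Several of these checks can be approximated locally before pushing. The Python sketch below is a hypothetical pre-flight helper, not the registry's validator: it covers only the rules that need no network access, and it assumes name patterns are matched as case-insensitive substrings. UNKNOWN_CATEGORY, FORBIDDEN_FIELD, and SITE_EXCLUDED are left out of this sketch.

```python
import json

FORBIDDEN_NAME_PATTERNS = ("bypass", "circumvent", "cf-clearance", "anti-bot", "rotate-ip")


def preflight(raw: str) -> list[str]:
    """Hypothetical helper: return error codes a push would likely hit."""
    try:
        schema = json.loads(raw)
    except json.JSONDecodeError:
        return ["INVALID_JSON"]
    if not isinstance(schema, dict):
        # Assumption: a non-object body is reported the same way as unparseable JSON.
        return ["INVALID_JSON"]

    errors = []
    for field in ("site", "intent_category"):
        if not schema.get(field):
            errors.append("MISSING_FIELD")
    # An omitted schema_format_version defaults to "0.1".
    if schema.get("schema_format_version", "0.1") != "0.1":
        errors.append("UNSUPPORTED_FORMAT")
    # Assumption: patterns match as case-insensitive substrings of the name.
    name = str(schema.get("name", "")).lower()
    if any(pattern in name for pattern in FORBIDDEN_NAME_PATTERNS):
        errors.append("FORBIDDEN_NAME")
    return errors


good = json.dumps({"site": "allbirds.com", "intent_category": "commerce", "name": "shopify-products"})
print(preflight(good))
print(preflight('{"intent_category": "commerce"}'))
```

An empty list only means none of these local checks failed; the registry remains the authority on whether a push is accepted.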

Forbidden fields

These JSON keys are rejected if found anywhere in the schema, at any depth:
| Field | Why it’s forbidden |
| --- | --- |
| proxy_credentials | Actual proxy login credentials: a secret that must never be published |
| residential_proxy | Proxy infrastructure configuration: operational, not schema data |
| clearance_cookie_js | JavaScript used to generate clearance cookies: an arms-race implementation detail |
| clearance_cookies | Ephemeral session cookies: site-specific and time-limited, not useful to other users |
| bypass_method | Describes a circumvention technique: violates naming policy |
| stealth_script | Custom anti-detection JavaScript: operational, not schema data |
| tls_fingerprint | Specific TLS profile identifier: this changes frequently and is better managed by the CLI runtime than baked into a schema |

Capability signals are allowed. The distinction is between metadata that helps agents understand what a site requires (like requires_stealth: true) and operational secrets that describe how to bypass a defense (like proxy_credentials: "user:pass@host"). The former belongs in schemas; the latter belongs in the CLI runtime.
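
Because the check applies at any depth, a local scan has to walk nested objects and arrays. Here is a minimal Python sketch of such a walk. `find_forbidden_keys` is a hypothetical helper, and exact, case-sensitive key matching is an assumption about how the registry compares keys.

```python
FORBIDDEN_FIELDS = {
    "proxy_credentials",
    "residential_proxy",
    "clearance_cookie_js",
    "clearance_cookies",
    "bypass_method",
    "stealth_script",
    "tls_fingerprint",
}


def find_forbidden_keys(node, path="$"):
    """Hypothetical helper: list the paths of forbidden keys anywhere in a schema."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            # Assumption: keys are compared exactly (case-sensitive).
            if key in FORBIDDEN_FIELDS:
                hits.append(child)
            hits.extend(find_forbidden_keys(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_forbidden_keys(item, f"{path}[{index}]"))
    return hits


schema = {
    "site": "example.com",
    "requires_stealth": True,  # capability signal: allowed
    "session": {"clearance_cookies": ["..."]},
    "actions": [{"name": "search", "tls_fingerprint": "..."}],
}
print(find_forbidden_keys(schema))
```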

Content addressing

Each schema is hashed (SHA-256 of canonical JSON with sorted keys). The hash becomes the version ID. Pushing the same content twice for the same site is a no-op — the registry recognizes it as a duplicate.
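
The idea can be sketched in Python as follows. The exact canonicalization the registry uses (separators, Unicode escaping, number formatting) is not specified here, so the choices below are assumptions and the resulting digest may not match the registry's `version_hash`. The sketch only shows why key order does not affect the hash.

```python
import hashlib
import json


def content_hash(schema: dict) -> str:
    """Hypothetical helper: SHA-256 over a sorted-key JSON serialization."""
    # Assumption: compact separators and UTF-8 output; the registry may differ.
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = {"site": "allbirds.com", "intent_category": "commerce"}
b = {"intent_category": "commerce", "site": "allbirds.com"}  # same content, different key order
c = {"site": "allbirds.com", "intent_category": "travel"}

print(content_hash(a) == content_hash(b))
print(content_hash(a) == content_hash(c))
```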

Verified badge

Schemas are community-contributed and unverified by default. The Hermai team reviews schemas and awards a verified badge to those that meet quality standards. Agents can filter to verified schemas only:
# Only verified schemas
curl "https://api.hermai.ai/v1/schemas?verified=true"
Or include everything (the default):
# All schemas, verified and unverified
curl "https://api.hermai.ai/v1/schemas"
API keys have a schema access toggle in the dashboard that controls whether unverified schemas are included in catalog responses for that key.

Schema versioning

Each push creates a new version. The registry tracks which version is “latest” for each site. When you query the catalog, you get the latest version by default. Version history is available:
curl "https://api.hermai.ai/v1/schemas/allbirds.com/versions"

Public card vs full package

The registry exposes two views of every schema.

Public card (no API key needed). Metadata about what the schema does:
  • Site, category, name, description
  • Endpoint names, methods, descriptions
  • Variable names and sources (path/query/body)
  • Query parameter keys and whether they’re required
  • Response field names and types (top level only)
  • Capability badges (requires browser, requires session, has auth)

Full package (API key + intent required). Everything an agent needs to execute:
  • URL templates with actual URLs
  • Header values
  • Body templates
  • Session configuration
  • Variable patterns and defaults
The split is intentional: the card answers “is this useful?”, the package answers “how do I run it?”
# Public card (no auth)
curl "https://api.hermai.ai/v1/schemas/allbirds.com"

# Full package (auth + intent required)
curl -H "Authorization: Bearer hm_sk_..." \
     -H "X-Hermai-Intent: downloading allbirds schema for a price comparison agent" \
     "https://api.hermai.ai/v1/schemas/allbirds.com/package"

Credits for contributions

You earn +50 credits per site when you contribute a schema. Credits accumulate and become your starting balance when hosted execution launches in a future phase.