geo: agent-readiness foundation (robots, api-catalog, agent-skills)#220

Open: vcoisne wants to merge 1 commit into PlakarKorp:main from vcoisne:geo/agent-readiness-foundation
@vcoisne vcoisne commented May 10, 2026

PR 1 — geo: agent-readiness foundation (robots, api-catalog, agent-skills)

Branch: geo/agent-readiness-foundation
Patch: 01-agent-readiness-foundation.patch
Closes: 4 of the 9 isitagentready.com audit findings

Why

Agent-readiness is the GEO (Generative Engine Optimization) equivalent of a Lighthouse audit: a battery of standards-based checks that determine how well a site is grounded by AI assistants and traversed by autonomous agents. This PR closes the four findings that resolve to static files only — no infrastructure dependencies, no auth dependencies, no service builds.

What changes

| File | Purpose | Spec |
| --- | --- | --- |
| layouts/robots.txt | Hugo template emitting Content-Signal directives + per-bot overrides | draft-romm-aipref-contentsignals |
| static/.well-known/api-catalog | linkset+json declaring docs and status endpoints | RFC 9727, RFC 9264 |
| static/.well-known/agent-skills/index.json | v0.2.0 skills discovery index with sha256 digests | agent-skills-discovery-rfc |
| static/.well-known/agent-skills/{install-plakar,backup-postgres,restore-snapshot}/SKILL.md | Three authoritative how-tos for AI assistants | same |

robots.txt stance — please confirm

The template asserts:

User-agent: *
Content-Signal: search=yes, ai-input=yes, ai-train=no

…with explicit Disallow: / overrides for Google-Extended, GPTBot, and ClaudeBot (training crawlers), and Allow: / for OAI-SearchBot, Claude-User, and PerplexityBot (retrieval crawlers).

Rationale: this is the OSS default of "be visible in search and AI answers, do not contribute to training corpora". If product strategy is "maximum reach including training", flip ai-train=no to ai-train=yes and remove the per-bot training disallows.
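
Whichever stance is confirmed, it can be asserted against the built file so a template regression is caught early. The following is a minimal sketch, not part of this patch: the function names `bot_rule` and `check_robots_stance` are hypothetical, and it assumes the rendered Content-Signal line matches the excerpt above character for character and that each bot's group carries a single Allow or Disallow rule.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of this PR) that assert the robots.txt stance.

# bot_rule FILE BOT: print the first allow/disallow rule in BOT's group,
# lowercased, e.g. "disallow: /". Consecutive User-agent lines share a group.
bot_rule() {
  awk -v bot="$2" '
    { sub(/\r$/, "") }
    /^[ \t]*(#|$)/ { next }                       # skip comments and blank lines
    tolower($1) == "user-agent:" {
      if (!in_ua_run) matched = 0                 # first User-agent line of a new group
      if (tolower($2) == tolower(bot)) matched = 1
      in_ua_run = 1
      next
    }
    { in_ua_run = 0 }
    matched && (tolower($1) == "allow:" || tolower($1) == "disallow:") {
      print tolower($1), $2
      exit
    }
  ' "$1"
}

# check_robots_stance [FILE]: succeed only if the default Content-Signal line
# and all six per-bot overrides described above are present.
check_robots_stance() {
  local file="${1:-public/robots.txt}" bot status=0
  if ! grep -qF 'Content-Signal: search=yes, ai-input=yes, ai-train=no' "$file"; then
    echo "missing default Content-Signal line" >&2
    status=1
  fi
  for bot in Google-Extended GPTBot ClaudeBot; do       # training crawlers
    if [ "$(bot_rule "$file" "$bot")" != "disallow: /" ]; then
      echo "$bot: expected Disallow: /" >&2
      status=1
    fi
  done
  for bot in OAI-SearchBot Claude-User PerplexityBot; do # retrieval crawlers
    if [ "$(bot_rule "$file" "$bot")" != "allow: /" ]; then
      echo "$bot: expected Allow: /" >&2
      status=1
    fi
  done
  return "$status"
}
```

After a local build, `check_robots_stance public/robots.txt` would exit non-zero and name the offending bot if the template output drifts from the stance.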

Cloudflare Transform Rule (required for the api-catalog)

GitHub Pages serves static/.well-known/api-catalog (no extension) as application/octet-stream. The audit scanner is content-aware enough to still pass this, but spec-compliant clients want application/linkset+json. Add this Transform Rule in the Cloudflare dashboard after merging:

  • When: (http.request.uri.path eq "/.well-known/api-catalog")
  • Then: Set static Content-Type: application/linkset+json

Same pattern for the other extension-less files we'll add in subsequent PRs (oauth-protected-resource, etc.).
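
Once the Transform Rule is live, the header can be verified from the command line. This is a small sketch under the assumption that the rule sets the media type exactly as written above; `is_linkset_content_type` is a hypothetical helper name, and it reads response headers on stdin so it can be exercised without network access.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the HTTP headers on stdin declare
# Content-Type: application/linkset+json (optional parameters allowed).
is_linkset_content_type() {
  tr -d '\r' |
    grep -iqE '^content-type:[ \t]*application/linkset\+json([ \t]*;.*)?[ \t]*$'
}

# Usage after deploy (performs a real request, so it is left commented out):
#   curl -sI https://plakar.io/.well-known/api-catalog | is_linkset_content_type \
#     && echo "api-catalog content type OK"
```

The same function could be pointed at the other extension-less paths once their rules exist, with the expected media type adjusted per file.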

Testing

Local Hugo build:

npm run build
ls -la public/.well-known/
cat public/robots.txt

After deploy:

curl -s https://plakar.io/robots.txt | head -20
curl -s https://plakar.io/.well-known/api-catalog | jq .linkset
curl -s https://plakar.io/.well-known/agent-skills/index.json | jq .skills

# Re-run the original audit
curl -sX POST https://isitagentready.com/api/scan \
  -H 'Content-Type: application/json' \
  -d '{"url":"https://plakar.io"}' | jq '.checks.botAccessControl.contentSignals, .checks.discovery.apiCatalog, .checks.discovery.agentSkills'

All three checks should report "status": "pass".
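
Independently of the scanner, the deployed api-catalog can be summarized by link relation. The sketch below assumes the file follows the RFC 9264 JSON shape (a top-level `linkset` array of context objects whose members other than `anchor` are arrays of targets with an `href`); `list_linkset_relations` is a hypothetical helper name and it needs `jq`, as the commands above already do.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads a linkset+json document on stdin and prints one
# "relation<TAB>href" line per link target, skipping the "anchor" member.
list_linkset_relations() {
  jq -r '
    .linkset[]
    | to_entries[]
    | select(.key != "anchor")
    | .key as $rel
    | .value[]
    | "\($rel)\t\(.href)"
  '
}

# Usage after deploy (performs a real request, so it is left commented out):
#   curl -s https://plakar.io/.well-known/api-catalog | list_linkset_relations
```

Per the commit message, the output should include service-doc entries for the docs sites and an entry for the status page.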

Maintaining the agent-skills index

Whenever a SKILL.md is added or edited, recompute its sha256 and update index.json. Suggested CI check:

sha256sum static/.well-known/agent-skills/*/SKILL.md

Diff each digest against the committed index.json and fail the build if any digest has drifted.
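
One way to script that check is sketched below. It is deliberately schema-agnostic: rather than assuming field names in index.json, it only requires that each SKILL.md's hex digest appears somewhere in the index. That hex encoding is an assumption (it is what `sha256sum` prints); if the index stores digests in another form, the comparison would need adapting. `check_skill_digests` is a hypothetical function name.

```shell
#!/usr/bin/env bash
# Hypothetical CI helper: fail if any SKILL.md digest is absent from index.json.
# Assumes index.json stores digests as hex strings, as printed by sha256sum.
check_skill_digests() {
  local root="${1:-static/.well-known/agent-skills}"
  local index="$root/index.json"
  local status=0 file digest
  for file in "$root"/*/SKILL.md; do
    [ -e "$file" ] || continue                  # glob matched nothing
    digest="$(sha256sum "$file" | cut -d' ' -f1)"
    if ! grep -qiF "$digest" "$index"; then
      echo "digest drift: $file ($digest not found in $index)" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage in CI, from the repository root:
#   check_skill_digests || exit 1
```

Because it matches digests anywhere in the file, this catches an edited SKILL.md whose digest was not refreshed, but it will not notice a digest attached to the wrong skill entry; a schema-aware check could tighten that later.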

What's intentionally NOT here

  • The fourth quick-win (markdown content negotiation) is handled at the Cloudflare layer (enable "Markdown for Agents" in the dashboard, or deploy a Worker). It's not a Hugo concern, so it doesn't belong in this repo.
  • OAuth/OIDC discovery (issues 5 and 6) lives on auth.plakar.io, not the marketing site.
  • The MCP server card is a separate PR (fix plakar create example #2) because its dependence on a not-yet-built MCP server is worth review on its own.

…log, agent-skills)

Closes 4 of the 9 findings from the isitagentready.com audit.

robots.txt
  Switches from the empty default to a Hugo template emitting Content-Signal
  directives per draft-romm-aipref-contentsignals. Site stance:
    - search=yes      (keep traditional SEO)
    - ai-input=yes    (let assistants ground answers in our content)
    - ai-train=no     (do not contribute to training corpora)
  Per-bot overrides cover Google-Extended, GPTBot, OAI-SearchBot, ClaudeBot,
  Claude-User, and PerplexityBot.

/.well-known/api-catalog
  RFC 9727 linkset+json declaring service-doc (docs.plakar.io and
  /control-plane-docs/) and the status page. Lets agents discover the API
  surface without scraping.

  Note: GitHub Pages will serve this as application/octet-stream by default.
  Add a Cloudflare Transform Rule to set Content-Type to application/linkset+json
  on this path (covered in the PR description).

/.well-known/agent-skills/
  Skills discovery index (v0.2.0) plus three SKILL.md authoritative how-tos:
  install-plakar, backup-postgres, restore-snapshot. SHA-256 digests in the
  index match the file contents.

  These give AI assistants a stable, citable source for 'how do I do X with
  Plakar?' questions, which is the core GEO play.

Re-run the audit after deploy:
  curl -sX POST https://isitagentready.com/api/scan \
    -H 'Content-Type: application/json' \
    -d '{"url":"https://plakar.io"}' | jq '.checks'

Expected to flip checks.botAccessControl.contentSignals,
checks.discovery.apiCatalog, and checks.discovery.agentSkills to pass.
@poolpOrg
Contributor

I'm going to review this with a clear mind
