Skip to content

Universal Requirements — 18 Items

Eighteen items mapped to the five pillars.

  • Required — established public standard (RFC, near-universal). Shipping without it is a defect.
  • Recommended — published spec or widely-adopted vendor convention. Exceptions need written rationale.
  • Conditional — applies only to specific property types (APIs, transactional sites, sites that originate bot traffic).
  • Emerging — actively-shipping draft standards we deploy behind feature flags.
  • Framix convention — house rule. Not a public standard; client-portable but not interoperable by default.

01. robots.txt with AI bot allowlist + Content Signals — required. Physical file at site root. Explicit User-agent + Allow for AI crawlers (ChatGPT-User, OAI-SearchBot, ClaudeBot, PerplexityBot, Google-Extended, etc.). Sitemap line. Cloudflare’s Content Signals comment block (ai-train / search / ai-input = yes/no) above the user-agent rules. Allowlist reviewed quarterly. Authority: RFC 9309 + Cloudflare convention; coordinating with IETF AI Preferences WG (draft-ietf-aipref-vocab).

02. sitemap.xml — required. All indexable URLs, current <lastmod>, separate sitemaps per post type. Excludes thin pages, tag archives, search, pagination. Authority: sitemaps.org.

03. Canonical URLs on every indexable page — required. Absolute, https, consistent host. Cloudflare’s Redirects for AI Training (April 2026 beta) elevates these into 301s for verified training crawlers — keep canonicals correct or live with the consequences in training data.

04. Link response header — recommended. Points to markdown alternate, canonical, and next/prev. Emitted on every HTML response. Authority: RFC 8288.

05. Markdown via content negotiation and URL-suffix fallback — recommended. Accept: text/markdown returns Content-Type: text/markdown; charset=utf-8, Vary: accept, x-markdown-tokens: N. Plus a parallel /index.md URL on every page (rewrite rule, not static duplicate files). The fallback is mandatory: of seven coding agents Cloudflare tested in February 2026, only Claude Code, OpenCode, and Cursor send the Accept header; the rest need a URL they can guess. Authority: Cloudflare convention.

06. HTML stub instructions for non-markdown agents — Framix convention. The <body> of HTML responses opens with a hidden-but-visible-to-agents block: “STOP. This is the HTML version. Get this page as Markdown at /index.md or with Accept: text/markdown.” Mirrors the Cloudflare docs pattern. Costs nothing, improves outcomes for agents that don’t negotiate.

07. /llms.txt — recommended. Markdown, site-root, under 50 KB. H1, blockquote summary, curated link list. Independent studies through Q1 2026 show no measurable citation uplift from llms.txt alone. Ship it because the cost is negligible, it lands well with developer-tools clients (Anthropic, Vercel, Cloudflare, Mintlify all use it), and the format is becoming a discoverability convention. Don’t sell it as a citation lever. Authority: llmstxt.org community proposal.

08. Hierarchical llms-full.txt per section — Framix convention. Per-section files (e.g. /blog/llms-full.txt, /docs/llms-full.txt). Hard-cap 200 KB each. Excludes thin / duplicate / deprecated pages. Mirrors the Cloudflare docs pattern.

09. Semantic HTML, SSR-first — required. All indexable content in the first response. No JS-only rendering. Proper landmarks: <article>, <main>, <nav>, heading hierarchy. <script type="application/ld+json"> blocks preserved through markdown conversion (Cloudflare’s converter does this automatically; static-export pipelines need the equivalent).

10. Quick-answer paragraph (40–80 words) — Framix convention. First paragraph directly answers the core query. No preamble, no “in this article we’ll cover.”

11. Schema.org JSON-LD — required. Organization + WebSite + Article / Product / Service + BreadcrumbList. Person schema with credentials on every byline. Authority: schema.org.

12. Author byline + bio + credentials — recommended. Named author, bio block, LinkedIn, credentials. Primary-source citations on data-driven pieces (government, academic, registry data — not blog-citing-blog).

13. Freshness + deprecation governance — recommended. dateModified in schema and visible text. Deprecated posts: noindex, banner, replacement URL in CMS field, Link: rel="successor-version", <link rel="canonical"> pointed at the replacement. Monthly scan flags content older than 12 months. Cloudflare’s Redirects for AI Training turns canonicals into 301s for training crawlers, so this hygiene affects model behaviour, not just SEO.

14. content-signal response header on HTML — recommended. Every public HTML response: content-signal: ai-train=yes, search=yes, ai-input=yes (or per-client policy). Excluded from admin, REST, cron, and authenticated paths. Authority: Cloudflare convention; IETF AI Preferences WG (draft-ietf-aipref-vocab) in progress.

15. Web Bot Auth verification (origin-side) — conditional. For sites that originate bot or agent traffic to other origins: publish keys at /.well-known/http-message-signatures-directory, sign requests per RFC 9421 with the Signature-Agent header. Cloudflare and Google co-author the IETF drafts; the working group was chartered early 2026. Google is testing it in production for its agent fleet. Authority: IETF Web Bot Auth WGdraft-meunier-web-bot-auth-architecture-05, draft-meunier-http-message-signatures-directory-05.

16. /.well-known/agent-statement.json — Framix convention. JSON: site identity, AI usage policy summary, endpoint URLs (sitemap, llms, api-catalog, mcp). CMS-editable. Glue between the Framix scorecard, audits, and client policy.

17. /.well-known/api-catalog — conditional. JSON pointer to OpenAPI specs, webhook endpoints, partner docs. Pair with /.well-known/oauth-authorization-server and /.well-known/oauth-protected-resource when APIs are gated. Skip on pure content sites. Authority: RFC 9727; RFC 8414; RFC 9728.

18. MCP Server Card + Agent Skills index — emerging. MCP Server Card at /.well-known/mcp/server-card.json and a domain catalog at /.well-known/mcp/catalog. Agent Skills index at /.well-known/agent-skills/index.json (Cloudflare RFC v0.2, March 2026; rename from /skills/). Ship for docs sites, product catalogs, regional data, configurable APIs. Both are draft standards; deploy behind a feature flag, monitor SEP/RFC churn quarterly. Authority: MCP SEP-2127; Cloudflare Agent Skills Discovery RFC.

Commerce protocols (x402, UCP, ACP, A2A) follow a separate decision tree — see Pillar 5 § Commerce track.