ADR-067: UAC as LLM-Optimized Executable Contract
1. Status
Status: Proposed
Date: 2026-04-25
Deciders: STOA Core Team
Related decisions: ADR-012 MCP RBAC Architecture, ADR-021 UAC-Driven Observability, ADR-022 UAC Tenant Architecture, ADR-030 AI-Native Context Management, ADR-045 stoa.yaml Declarative API Specification, ADR-051 Lazy MCP Discovery, ADR-063 SDD Level 1 + stoa-impact MCP knowledge agent.
2. Context
STOA uses UAC (Universal API Contract) as the central product contract behind the principle "Define once, expose everywhere."
A UAC can be projected into multiple runtime and product surfaces:
- REST routes
- MCP tools
- security policies
- contract tests
- smoke tests
- documentation and catalog entries
- observability labels and traces
STOA is also developed and maintained with significant AI assistance. That makes the contract more than a runtime artifact. It must be readable, stable, and verifiable by three consumers:
- the STOA runtime
- humans and architects
- LLMs and AI agents
Existing UAC fields already describe technical routing, schemas, lifecycle, classification, and policies. They are not always enough for an agent to decide whether a generated MCP tool should be used, when it should be used, and what behavior must remain stable.
This ADR does not replace the existing llm_config capability used for LLM backend contracts. llm_config describes an AI backend exposed by STOA. The metadata proposed here describes how an API operation should be understood and governed when projected to an agent-facing tool.
3. Problem
A classic API contract can be sufficient for a gateway but insufficient for an LLM.
An agent needs compact operational intent:
- what the operation does
- when to use it
- whether it changes state
- whether an autonomous agent may call it safely
- whether a human approval step is required
- what a valid input/output example looks like
Without this information, AI agents can:
- choose the wrong tool
- call endpoints at the wrong time
- perform writes when a read was expected
- generate inconsistent code around product invariants
- treat dangerous operations as normal helper functions
- produce rewrites that are technically valid but contractually wrong
The gap is operation-level. A single API contract can contain both a safe GET /customers/{id} and a destructive DELETE /customers/{id}. These operations cannot share the same intent, side effects, approval rule, or examples.
4. Decision
STOA will treat UAC as an LLM-optimized executable contract.
In v1, STOA adds a compact llm metadata block at the endpoint / operation level. The block is recommended for endpoints exposed as MCP tools and may be validated in warning mode. It is not immediately mandatory for all existing UAC endpoints.
The decision is deliberately small:
- UAC remains the primary product contract.
- MCP tools remain projections of UAC operations.
- Smoke tests remain runtime proof that a UAC is actually exposed.
- Agent-facing features should be traceable to UAC metadata, not only to code.
- V1 is warning-only: the metadata is recommended, not required.
- V2 may make the metadata mandatory for new endpoints exposed through MCP.
The guiding rule is:
The UAC describes. The smoke test proves.
5. UAC LLM Metadata v1
Scope
The v1 metadata lives on each endpoint:
    {
      "path": "/customers/{id}",
      "methods": ["GET"],
      "backend_url": "https://backend.example.com/customers",
      "operation_id": "get_customer",
      "llm": {
        "summary": "Retrieve one customer by id.",
        "intent": "Use when an agent needs customer details before answering, checking eligibility, or preparing a follow-up action.",
        "tool_name": "customer_get_customer",
        "side_effects": "read",
        "safe_for_agents": true,
        "requires_human_approval": false,
        "examples": [
          {
            "input": {"id": "cust_123"},
            "expected_output_contains": {"id": "cust_123"}
          }
        ]
      }
    }
Required fields when llm is present
| Field | Type | V1 rule |
|---|---|---|
| summary | string | Short description of what the operation does. Target: one sentence, at most 160 chars. |
| intent | string | When an agent should use this operation. Target: one compact sentence, at most 360 chars. |
| tool_name | string | Stable MCP tool name for the operation. Transport adapters may map it when a client has stricter name rules. |
| side_effects | enum | One of none, read, write, destructive. |
| safe_for_agents | boolean | Whether an autonomous agent may call the operation under normal policy. |
| requires_human_approval | boolean | Whether a human approval step is required before execution. |
| examples | array | One or two minimal examples with valid input and expected_output_contains. |
side_effects semantics
| Value | Meaning | Default MCP hint |
|---|---|---|
| none | Pure operation, no external read or write. | readOnlyHint=true, openWorldHint=false, destructiveHint=false |
| read | Reads external data without changing it. | readOnlyHint=true, openWorldHint=true, destructiveHint=false |
| write | Creates or updates state, but is not destructive by product semantics. | readOnlyHint=false, destructiveHint=false |
| destructive | Deletes, revokes, overwrites, triggers irreversible business action, or has high blast radius. | readOnlyHint=false, destructiveHint=true |
The v1 hard rule is:
side_effects=destructive -> requires_human_approval=true
If this rule is violated, v1 validators should emit an error when llm is present. Missing llm metadata on an MCP-exposed endpoint is only a warning in v1.
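The error-versus-warning split above can be sketched as a small validator. This is a minimal illustration, not the STOA validator: the function name `validate_endpoint` and the `mcp_exposed` flag are hypothetical, since this ADR does not say how MCP exposure is recorded on an endpoint.

```python
from typing import Any

REQUIRED_FIELDS = (
    "summary", "intent", "tool_name", "side_effects",
    "safe_for_agents", "requires_human_approval", "examples",
)
SIDE_EFFECTS = {"none", "read", "write", "destructive"}


def validate_endpoint(
    endpoint: dict[str, Any], mcp_exposed: bool
) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for one UAC endpoint under the v1 rules.

    `mcp_exposed` is a hypothetical input; the ADR does not define where
    that flag lives in the contract.
    """
    errors: list[str] = []
    warnings: list[str] = []
    llm = endpoint.get("llm")

    if llm is None:
        # Missing metadata is only a warning in v1, and only when MCP-exposed.
        if mcp_exposed:
            warnings.append("MCP-exposed endpoint has no llm block")
        return errors, warnings

    missing = [f for f in REQUIRED_FIELDS if f not in llm]
    if missing:
        errors.append(f"llm block is missing fields: {', '.join(missing)}")

    side_effects = llm.get("side_effects")
    if side_effects is not None and side_effects not in SIDE_EFFECTS:
        errors.append(f"unknown side_effects value: {side_effects!r}")

    # Hard rule: destructive operations must require human approval.
    if side_effects == "destructive" and llm.get("requires_human_approval") is not True:
        errors.append("side_effects=destructive requires requires_human_approval=true")

    return errors, warnings
```

Returning errors and warnings as separate lists lets CI fail on the first and merely report the second, which matches the v1 posture.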
Fields intentionally deferred to v2
The following fields are useful but excluded from v1 to keep UAC short:
- sensitive_data
- do_not_use_when
- permissions
- rate_limit_policy
- approval_policy
- test_generation_hints
These should not be smuggled into summary or intent. If an operation needs complex governance in v1, the UAC should reference existing policy/classification mechanisms rather than embedding a long prompt-like block.
LLM-ready definition
In v1, an endpoint is LLM-ready if:
- it has an llm block with all v1 fields;
- summary and intent are compact enough to fit in tool discovery responses;
- tool_name is stable and unique inside the tenant/API namespace;
- side_effects is explicit;
- destructive operations require human approval;
- examples validate against input_schema where an input schema exists;
- examples do not contain secrets or real customer data.
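Part of this checklist is mechanical and can be sketched as a readiness report over one namespace. The function name `readiness_gaps` is hypothetical, the length limits reuse the targets from the field table, and the sketch assumes every endpoint carries an `operation_id`. Schema validation of examples and secret detection are deliberately left out.

```python
from collections import Counter

SUMMARY_MAX = 160  # target length from the v1 field table
INTENT_MAX = 360   # target length from the v1 field table


def readiness_gaps(endpoints: list[dict]) -> dict[str, list[str]]:
    """Map each operation_id to the LLM-ready criteria it does not yet meet."""
    # Count tool names across the namespace so duplicates can be flagged.
    name_counts = Counter(
        e["llm"]["tool_name"]
        for e in endpoints
        if "tool_name" in e.get("llm", {})
    )
    gaps: dict[str, list[str]] = {}
    for e in endpoints:
        problems: list[str] = []
        llm = e.get("llm")
        if llm is None:
            problems.append("no llm block")
        else:
            if len(llm.get("summary", "")) > SUMMARY_MAX:
                problems.append("summary too long")
            if len(llm.get("intent", "")) > INTENT_MAX:
                problems.append("intent too long")
            if name_counts[llm.get("tool_name")] > 1:
                problems.append("tool_name is not unique")
            if not 1 <= len(llm.get("examples", [])) <= 2:
                problems.append("expected one or two examples")
        gaps[e["operation_id"]] = problems
    return gaps
```

An empty list for an operation means none of the checked criteria failed; it does not by itself prove the endpoint is LLM-ready.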
6. Runtime implications
V1 must not introduce a broad runtime refactor.
Runtime implications are limited to validation and projection:
- The existing UAC runtime contract remains valid.
- Existing endpoint routing, backend URL handling, classification, and policies remain unchanged.
- llm is optional in v1 and can be ignored by older runtime components.
- Validators may warn when an endpoint with MCP exposure has no llm block.
- Validators should error when a present llm block is malformed.
- Validators should error when side_effects=destructive and requires_human_approval=false.
- Validators should warn when HTTP-method-derived behavior conflicts with llm.side_effects.
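The last rule, a warning when the HTTP method and the declared side effect disagree, might look like the sketch below. The `METHOD_EXPECTATION` table is an assumption made for illustration; the mapping the gateway actually uses is not described in this ADR.

```python
# Assumed expectations per HTTP method; not the gateway's real mapping.
METHOD_EXPECTATION = {
    "GET": {"none", "read"},
    "HEAD": {"none", "read"},
    "OPTIONS": {"none", "read"},
    "POST": {"write", "destructive"},
    "PUT": {"write", "destructive"},
    "PATCH": {"write", "destructive"},
    "DELETE": {"destructive"},
}


def method_conflicts(endpoint: dict) -> list[str]:
    """Return one warning per method whose usual behavior contradicts llm.side_effects."""
    side_effects = endpoint.get("llm", {}).get("side_effects")
    if side_effects is None:
        return []  # nothing declared, so nothing to compare
    warnings = []
    for method in endpoint.get("methods", []):
        expected = METHOD_EXPECTATION.get(method.upper())
        # Unknown methods are skipped rather than guessed at.
        if expected is not None and side_effects not in expected:
            warnings.append(
                f"{method} {endpoint.get('path')}: side_effects={side_effects} "
                f"is unusual for this method"
            )
    return warnings
```

Because a POST can legitimately be a read (for example a search endpoint), this check stays a warning and leaves the final call to the contract author.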
The runtime should treat UAC LLM metadata as an executable hint surface, not as a replacement for authorization, OPA, RBAC, or approval enforcement.
7. MCP implications
MCP tools are agent-facing projections of UAC operations.
When llm metadata is present, MCP generation should prefer it over purely mechanical defaults:
| MCP field / behavior | Source in v1 |
|---|---|
| Tool name | llm.tool_name, with legacy fallback to operation_id / generated name during transition. |
| Tool description | Compact composition of llm.summary and llm.intent. |
| readOnlyHint | Derived from llm.side_effects. |
| destructiveHint | true when llm.side_effects=destructive. |
| Approval UX / policy hint | llm.requires_human_approval. |
| Agent allow-list hint | llm.safe_for_agents. |
| Smoke examples | llm.examples. |
The current gateway already derives MCP annotations from UAC actions and HTTP methods. This ADR does not discard that behavior. V1 adds explicit product intent so generated tools are not only syntactically valid, but also understandable by LLM clients.
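As an illustration of this projection, the sketch below builds a tool descriptor from one endpoint. The helper names `mcp_hints` and `project_tool` are hypothetical, and the legacy branch is simplified: it falls back to `operation_id` only and omits the method-based annotation logic the gateway already has. Approval and allow-list hints are also left out, because this ADR does not say which MCP field carries them.

```python
def mcp_hints(side_effects: str) -> dict[str, bool]:
    """Derive default MCP annotations from llm.side_effects, per the semantics table."""
    hints = {
        "readOnlyHint": side_effects in ("none", "read"),
        "destructiveHint": side_effects == "destructive",
    }
    # The table only sets openWorldHint for the two read-only values.
    if side_effects in ("none", "read"):
        hints["openWorldHint"] = side_effects == "read"
    return hints


def project_tool(endpoint: dict) -> dict:
    """Build a minimal MCP tool descriptor from one UAC endpoint."""
    llm = endpoint.get("llm")
    if llm is None:
        # Simplified legacy fallback: name from operation_id, nothing else.
        return {"name": endpoint["operation_id"], "description": "", "annotations": {}}
    return {
        "name": llm["tool_name"],
        # Compact composition of summary and intent.
        "description": f"{llm['summary']} {llm['intent']}",
        "annotations": mcp_hints(llm["side_effects"]),
    }
```

Applied to the `get_customer` example from section 5, this yields a tool named `customer_get_customer` with `readOnlyHint` and `openWorldHint` set to true and `destructiveHint` set to false.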
During adoption, name behavior must remain backward compatible. Existing generated tool names may keep their current format until a contract is explicitly migrated. New LLM-ready endpoints should use llm.tool_name as their stable canonical name.
Transport-specific tool name constraints are adapter concerns. If a client rejects characters such as :, the adapter may map the canonical name to a transport-safe alias, but scoring, logging, catalog references, and smoke assertions should preserve the canonical identity.
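One possible shape for such an adapter is sketched below. The class `ToolNameMap`, the allowed character set, and the sample name `acme:billing:get_invoice` are all assumptions for illustration; which component owns this mapping is still an open question in this ADR.

```python
import re


def transport_alias(canonical: str, allowed: str = r"A-Za-z0-9_-") -> str:
    """Replace every character outside the allowed set with an underscore."""
    return re.sub(f"[^{allowed}]", "_", canonical)


class ToolNameMap:
    """Two-way mapping so logs and smoke assertions can keep the canonical name."""

    def __init__(self) -> None:
        self._by_alias: dict[str, str] = {}

    def register(self, canonical: str) -> str:
        """Return the transport-safe alias, refusing silent collisions."""
        alias = transport_alias(canonical)
        existing = self._by_alias.get(alias)
        if existing is not None and existing != canonical:
            raise ValueError(
                f"alias collision: {alias!r} maps to {existing!r} and {canonical!r}"
            )
        self._by_alias[alias] = canonical
        return alias

    def canonical(self, alias: str) -> str:
        """Recover the canonical name from what the client sent back."""
        return self._by_alias[alias]


names = ToolNameMap()
alias = names.register("acme:billing:get_invoice")  # hypothetical canonical name
print(alias, "->", names.canonical(alias))
```

Raising on collisions matters because two distinct canonical names can collapse to the same alias once separators are replaced.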
8. Smoke test implications
UAC metadata is not proof that a feature works. It is a declared contract.
Smoke tests provide the runtime proof:
UAC describes.
Smoke proves.
For LLM-ready endpoints, smoke tests should verify at least:
- the UAC validates structurally;
- the MCP projection exists in tools/list with the expected tool_name;
- the MCP description includes the intended compact summary/intent;
- the MCP annotations match side_effects;
- examples[].input validates against the input schema;
- safe read-only examples can be invoked in a deterministic smoke environment;
- write/destructive examples are not blindly executed in smoke;
- destructive tools surface approval or policy gating instead of silent execution.
This keeps smoke tests practical. V1 should not require full business scenario replay for every endpoint. It should prove discoverability, schema compatibility, safe invocation for safe reads, and gating for risky operations.
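A sketch of the discovery and annotation checks, plus the rule for which examples may be invoked, is shown below. It assumes the smoke harness has already fetched the `tools/list` result as a list of dicts; the function names `smoke_check` and `invocable_examples` are hypothetical, and input-schema validation is omitted.

```python
def smoke_check(endpoint: dict, listed_tools: list[dict]) -> list[str]:
    """Compare one LLM-ready endpoint with the tools returned by tools/list."""
    llm = endpoint["llm"]
    tool = next((t for t in listed_tools if t["name"] == llm["tool_name"]), None)
    if tool is None:
        return [f"{llm['tool_name']} not found in tools/list"]

    failures = []
    if llm["summary"] not in tool.get("description", ""):
        failures.append("description does not include llm.summary")

    annotations = tool.get("annotations", {})
    read_only = llm["side_effects"] in ("none", "read")
    destructive = llm["side_effects"] == "destructive"
    if annotations.get("readOnlyHint") != read_only:
        failures.append("readOnlyHint does not match side_effects")
    if annotations.get("destructiveHint", False) != destructive:
        failures.append("destructiveHint does not match side_effects")
    return failures


def invocable_examples(endpoint: dict) -> list[dict]:
    """Return only the examples that smoke may execute: safe, ungated reads."""
    llm = endpoint["llm"]
    safe_read = (
        llm["side_effects"] in ("none", "read")
        and llm["safe_for_agents"]
        and not llm["requires_human_approval"]
    )
    # Write and destructive examples are never executed blindly.
    return llm["examples"] if safe_read else []
```

Keeping the "may I invoke this?" decision in one function makes it harder for a later smoke suite to execute a write example by accident.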
9. PR / AI development implications
AI-assisted development should treat UAC as the product boundary for agent-exposed behavior.
V1 PR rule:
- If a PR adds or changes an MCP-exposed endpoint, it should add or update endpoint.llm metadata.
- If the metadata is missing, review and CI may warn, but should not block all existing endpoints in v1.
- If endpoint.llm is present and invalid, CI should fail.
- If a PR marks an operation as destructive, it must set requires_human_approval=true.
- If a PR exposes a feature only in code without an associated UAC operation or flow contract, reviewers should treat that as a contract gap.
V2 PR rule, once adoption is proven:
- New endpoints exposed through MCP must be LLM-ready before merge.
- Legacy endpoints may be grandfathered until touched.
This supports bounded AI PRs. A PR generated by an AI agent should be traceable to a contract operation, examples, and smoke proof rather than relying on code-only intent.
10. Consequences
Positive
- Agents receive clearer tool descriptions and are less likely to choose the wrong operation.
- MCP generation gains a stable operation-level source for tool naming and descriptions.
- Smoke tests can reuse examples instead of inventing separate fixtures.
- Destructive operations become visible in the contract before runtime incidents.
- Human reviewers get a compact place to verify product intent.
- STOA gains a guardrail against AI-assisted rewrites that are valid code but invalid product behavior.
Negative
- Endpoint authors must maintain a few extra fields.
- Metadata can drift if not checked by validators and smoke tests.
- The name safe_for_agents can create false confidence if runtime policies are weak.
- Tool naming migration may reveal inconsistencies between existing generated names, client transport constraints, and canonical names.
Mitigations
- Keep v1 warning / recommended for missing metadata.
- Fail validation only when metadata is present and malformed.
- Limit examples to one or two compact cases.
- Defer sensitive data, permissions, rate limits, and approval policy to v2.
- Add smoke proof before making the field mandatory.
11. Risks
UAC bloat
Risk: teams turn llm into long prompt text.
Mitigation: field length guidance, examples cap, v2 fields deferred, no embedded policy prose.
False safety
Risk: safe_for_agents=true is mistaken for authorization.
Mitigation: document that runtime auth, RBAC, OPA, and approval gates remain authoritative.
Drift between UAC and MCP generation
Risk: llm.tool_name differs from generated or discovered tool names.
Mitigation: smoke tests assert discovery by expected canonical name.
Over-migration
Risk: making v1 mandatory immediately creates a large cleanup project.
Mitigation: warning mode in v1, mandatory only for new MCP endpoints in v2.
Sensitive examples
Risk: examples accidentally contain real customer data or secrets.
Mitigation: validators and review checklist reject secrets and require synthetic examples.
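A validator-side check for this risk could start from simple heuristics, as in the sketch below. The key-name and value patterns are illustrative assumptions, not a STOA rule set, and a scan like this can neither prove an example is synthetic nor catch real customer data that looks ordinary; review remains necessary.

```python
import re

# Illustrative heuristics only; a real rule set would be broader and tuned.
SUSPICIOUS_KEYS = re.compile(r"(password|secret|token|api[_-]?key)", re.IGNORECASE)
SUSPICIOUS_VALUES = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                     # shape of an AWS access key id
    re.compile(r"Bearer\s+[A-Za-z0-9._-]{20,}"),         # bearer token in a header value
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),   # PEM private key header
]


def find_suspicious(value, path: str = "examples") -> list[str]:
    """Walk llm.examples recursively and report keys or values that look like secrets."""
    hits: list[str] = []
    if isinstance(value, dict):
        for key, child in value.items():
            if SUSPICIOUS_KEYS.search(str(key)):
                hits.append(f"{path}.{key}: suspicious key name")
            hits.extend(find_suspicious(child, f"{path}.{key}"))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            hits.extend(find_suspicious(child, f"{path}[{index}]"))
    elif isinstance(value, str):
        if any(pattern.search(value) for pattern in SUSPICIOUS_VALUES):
            hits.append(f"{path}: value looks like a credential")
    return hits
```

Each hit carries the path into the examples array, so a reviewer can jump straight to the offending field.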
12. Alternatives considered
A. Keep using OpenAPI descriptions only
Rejected for v1. OpenAPI descriptions can help humans, but they do not consistently encode agent intent, side effects, approval requirements, and smoke examples.
B. Add a separate agent manifest outside UAC
Rejected. A separate manifest would create a second source of truth and drift from the runtime contract.
C. Put metadata at contract level only
Rejected. Side effects, examples, and agent safety are operation-level properties. Contract-level metadata is too coarse for MCP tool generation.
D. Add full AI governance fields immediately
Rejected. Fields such as permissions, approval_policy, sensitive_data, and test_generation_hints are useful, but adding them now would turn the UAC into a large policy document before the basic loop is proven.
E. Do nothing
Rejected. Existing generated MCP tools can be syntactically correct while still being unclear or unsafe for agents.
13. Adoption plan
Phase 0: ADR only
Accept this ADR and align the team on endpoint-level LLM metadata.
Phase 1: Schema and validator in warning mode
Allow optional endpoints[].llm in the UAC schema.
Warn when a published MCP-exposed endpoint has no llm metadata. Fail malformed llm blocks when present.
Phase 2: Apply to one or two low-risk contracts
Migrate a small demo or smoke contract first, preferably read-only operations.
Avoid broad migration of all existing UACs.
Phase 3: MCP generator consumes v1 fields
Use summary, intent, tool_name, and side_effects during MCP tool generation when available. Preserve legacy fallback behavior.
Phase 4: Smoke test integration
Extend smoke tests to verify discovery, annotations, example schema validity, safe read invocation, and destructive gating.
Phase 5: V2 enforcement for new MCP endpoints
Once the loop is proven, make LLM metadata mandatory for new endpoints exposed through MCP. Legacy endpoints remain grandfathered until modified.
14. Open questions
- Should llm.tool_name be the canonical stored name, or should it remain an alias over the existing tenant:contract:operation naming pattern?
- Which component owns transport-safe name mapping when a client rejects characters accepted by STOA canonical names?
- Should safe_for_agents=false hide a tool from discovery, or expose it with a policy/approval marker?
- Where should requires_human_approval be enforced first: MCP client UX, gateway policy, Control Plane approval workflow, or all three progressively?
- Should write operations that move money or trigger external business commitments be classified as write or destructive by default?
- When v2 starts, should enforcement be tied to status=published, MCP binding enabled, or both?
- Should contract-level LLM defaults ever be allowed, or would that reintroduce inheritance and ambiguity rejected by ADR-022?