04 — Design Patterns & Notable Decisions
1. "One source, two wrappers"
Pattern: the same canonical artifact (a system prompt, a skill body) is referenced by two delivery wrappers — a Cowork plugin and a Managed-Agent cookbook — instead of duplicated.
Where:
- System prompts:
managed-agent-cookbooks/<slug>/agent.yaml:6-7references../../plugins/agent-plugins/<slug>/agents/<slug>.mdviasystem: { file: ... }. - Skill bundles: same yaml's
skills: [{from_plugin: ../../plugins/agent-plugins/<slug>}](agent.yaml:24-25) — at deploy time the script expands this to one{path: ...}per skill (deploy-managed-agent.sh:95-105). - Subagent skills: explicit
{path: ../../../plugins/agent-plugins/<slug>/skills/<skill>}(subagents/deck-writer.yaml:17-19).
Why it matters: the alternative is two parallel directory trees that drift. The drift would be silent — the agent runs, just with stale guidance. scripts/check.py makes drift a CI failure (check.py:114-131).
Trade-off: introduces a vendoring step (scripts/sync-agent-skills.py) — agent plugins bundle a synced copy of each skill so they're self-contained for Cowork install, but the source remains in vertical-plugins/. This is the price of "self-contained Cowork plugin" + "single source of truth" + "no symlinks-in-zip pitfalls."
2. Untrusted-reader / re-verifier / write-only-resolver isolation tiers
Pattern: every managed-agent cookbook that touches outsider documents splits responsibilities across at least three subagents with non-overlapping tool sets. Documented per-agent in the README "Security & handoffs" table.
Canonical example (managed-agent-cookbooks/gl-reconciler/README.md:24-31):
| Tier | Touches untrusted docs? | Tools | Connectors |
| reader | YES | Read, Grep only | None |
| Orchestrator | No | Read, Grep, Glob, Agent | Read-only MCPs |
| resolver (Write) | No | Read, Write, Edit | None |
Invariants:
- Untrusted readers have no MCP, no Bash, no Write — only
Read/Grep. Their output is constrained byoutput_schema:(subagents/reader.yaml:35-58). - The write-holder (
resolver,escalator,note-writer,pack-writer,flagger,poster,deck-writer) has Read/Write/Edit but no MCPs and no untrusted document access. - A separate
criticre-verifies break-style claims against trusted MCPs before the resolver consumes the verified set (subagents/critic.yaml:3-8).
Why this works: it's structural defense-in-depth, not behavioral. A prompt-injection payload in a custodian PDF could convince the reader to do anything — but the reader has no tools that matter. The orchestrator only sees length-capped, character-class-restricted JSON (subagents/reader.yaml:40-58). By the time data reaches the write-holder, it's already been (a) schema-trimmed, (b) re-verified against trusted sources, and (c) handed across an agent boundary that the model cannot cross by itself.
The schema constraint is the load-bearing piece. Look at the regex in subagents/reader.yaml:50-58:
account: { type: string, maxLength: 64, pattern: "^[A-Za-z0-9._:-]+$" }
evidence_refs:
items: { type: string, maxLength: 256, pattern: "^[A-Za-z0-9 ._/:#-]+$" }An attacker's English-language injection cannot survive this filter — the character class excludes spaces (in account), parens, and the punctuation needed for natural-language instructions. Schema-validation thereby acts as a structural sanitizer, not just a contract.
scripts/validate.py:14-37 is the runtime enforcer (the API doesn't enforce this for you), and the deploy script keeps output_schema out of the deployed body (deploy-managed-agent.sh:154 del(.output_schema)) because it's a harness-side concern.
3. Depth-1 enforcement
Pattern: managed-agents are flat. An orchestrator can dispatch leaf workers, but workers cannot dispatch their own subagents.
Where enforced: scripts/test-cookbooks.sh:14-17 — a Python check on each dry-run output:
for i,x in enumerate(b):
if i<len(b)-1 and x.get('callable_agents'): errs.append(f'{x.get("name")}: depth>1 (subagent has callable_agents)')Why: depth-2 hierarchies make the security perimeter unreasonable to audit. Each additional hop multiplies the attack surface for prompt injection and gives an attacker more chances to land a Write-equipped tool. Depth-1 also matches the way the Managed Agents preview surface is intended to be used.
4. Handoff-as-text + allowlisted router
Pattern: instead of a built-in tool call, cross-agent handoff is a JSON blob the orchestrator emits in its message text. An out-of-band watcher (scripts/orchestrate.py) parses, validates, and routes it.
Why text instead of a tool call? The Managed Agents preview doesn't ship a first-class handoff primitive yet. Text is a portable shim that any workflow engine can consume.
The known weakness is documented in the script itself (orchestrate.py:8-15):
"""
Security note: handoff requests are surfaced in the orchestrator's text output,
which is downstream of untrusted-document readers. An attacker who controls a
processed document could embed a literal handoff_request blob that, if echoed,
would be parsed here. This script mitigates by (a) hard-allowlisting
target_agent against the deployed slugs and (b) schema-validating the payload
before steering. In production, prefer emitting handoffs via a dedicated tool
call or a typed SSE event the model cannot produce by quoting document text.
"""This is rare and good — the scaffolding is honest about its own limitations and points the user at the better solution.
Mitigations applied:
ALLOWED_TARGETSallowlist (orchestrate.py:23-27) — even if a malicious document forges ahandoff_request, it can only target one of the 10 deployed slugs.HANDOFF_PAYLOAD_SCHEMA(orchestrate.py:29-38) —eventcapped at 2000 chars,context_refrestricted to^[A-Za-z0-9 ._/:#-]+$.- Silent drop on failure (
return Noneinextract_handoff). The agent text still goes to the user; only the routing skips.
5. Skill auto-discovery via description
Pattern: skills don't have to be invoked explicitly — Claude fires them when the conversation matches the skill's description field.
Where: every SKILL.md opens with:
---
name: dcf-model
description: Real DCF (Discounted Cash Flow) model creation for equity valuation. Retrieves financial data from SEC filings... Triggers on "DCF", "intrinsic value", "comprehensive valuation"...
---(plugins/agent-plugins/pitch-agent/skills/dcf-model/SKILL.md:1-3)
Why: the description doubles as a routing prompt. The agent decides "the user said 'DCF', the dcf-model skill matches" and loads the body. This is simpler than maintaining an explicit skill router but means skill descriptions are de facto code — vague descriptions = unpredictable triggering.
Notice the skill pattern in CIM-builder (plugins/vertical-plugins/investment-banking/skills/cim-builder/SKILL.md:3): "Triggers on 'CIM', 'confidential information memorandum', 'offering memorandum', 'info memo', 'draft CIM', or 'sell-side materials'." The description explicitly lists trigger keywords because the model's matching is description-text-driven.
6. System prompts as workflow contracts
Pattern: every agent system prompt has the same five sections:
- What you produce — explicit list of artifacts.
- Workflow — numbered steps that name the skills used at each step.
- Guardrails — what the agent must not do (publish, post, send email, decide).
- Skills this agent uses — back-tick-quoted list.
- (Implicit frontmatter) —
tools:allowlist of tool families.
Compare the openings of gl-reconciler.md, pitch-agent.md, model-builder.md, kyc-screener.md, earnings-reviewer.md — same skeleton, different domain.
The skill list at the bottom is mechanically checked: check.py:134-143 regex-extracts every back-tick-quoted skill from the agent prompt and verifies it's bundled in the agent plugin's skills/. This means the prompt's narrative claims about which skills exist are kept honest in CI.
7. "Stop and surface for review" as a hard prompt rule
Pattern: every agent's Guardrails section includes some variant of "stop and surface for review."
| Agent | Guardrail wording | Source |
|---|---|---|
| pitch-agent | "Stop and surface for review after the Excel model is built and again after the deck is generated. The banker approves each artifact before you proceed to the next." | pitch-agent.md:32 |
| earnings-reviewer | "Surface for review. Stage the model and note as drafts. Do not publish externally." | earnings-reviewer.md:24 |
| model-builder | "Stop and surface after build and again after audit. The user approves before sensitivities." | model-builder.md:30 |
| gl-reconciler | "No ledger posting. This agent produces a report; ledger adjustments require human approval outside the agent." | gl-reconciler.md:29 |
| kyc-screener | "No risk-rating decision. This agent recommends; the compliance officer decides." | kyc-screener.md:27 |
This is the regulatory/compliance design of the repo encoded in prose: every agent has at least one explicit "you don't decide" rule.
8. CSV-as-pseudocode in skill prompts
Pattern: spreadsheet-building skills describe their target Excel layout as CSV-shaped tables in the prompt itself. Example from dcf-model/SKILL.md:912-960:
Income Statement ($M),2020A,2021A,2022A,2023A,2024E,2025E,2026E
Revenue,XXX,XXX,XXX,XXX,[=E29*(1+$E$10)],[=F29*(1+$E$11)],[=G29*(1+$E$12)]
% growth,XX%,XX%,XX%,XX%,[=E29/D29-1],[=F29/E29-1],[=G29/F29-1]Why: CSV is already linear text Claude can pattern-match against, and it's directly transcribable to openpyxl ws["E29"] = "=..." writes. The bracketed [=formula] notation marks "write a formula here", not "compute this Python-side and paste the value" — which the prompt enforces in capitals throughout (SKILL.md:44-49):
Formulas Over Hardcodes (NON-NEGOTIABLE): Every projection, margin, discount factor, PV, and sensitivity cell MUST be a live Excel formula — never a value computed in Python and written as a number.
9. Common-mistakes section as inverted prompting
Pattern: skill bodies include a <common_mistakes> section listing wrong patterns alongside <correct_patterns>. Both are wrapped in pseudo-XML tags.
Example: dcf-model/SKILL.md:368-756:
<correct_patterns>
This section contains all the CORRECT patterns to follow when building DCF models.
...
</correct_patterns>
<common_mistakes>
This section contains all the WRONG patterns to avoid when building DCF models.
### WRONG: Simplified Sensitivity Table Approximations or Placeholder Text
Don't use linear approximations:
// WRONG - Linear approximation B97: =B88*(1+(0.096-0.116)) // Assumes linear relationship
Don't leave placeholder text:
// WRONG - Placeholder note "Note: Use Excel Data Table feature..."
...
</common_mistakes>
This is unusual: the prompt actively contains anti-examples it wants the model to recognize and not emit. The XML tags make the section retrievable as a unit (Claude can refer to "see <common_mistakes>") and the WRONG/INSTEAD pairing is explicit.
The DCF skill goes further with a TOP 5 ERRORS SUMMARY at line 717, then "Re-read this section before starting any DCF build" at 754 — the prompt is, in effect, asking the model to perform a self-review against a known failure list.
10. Local override files
Pattern: verticals ship a .local.md.example that the user copies and customizes; the real file is gitignored.
Example: plugins/vertical-plugins/investment-banking/.claude/investment-banking.local.md.example carries the user's name, title, sectors, active deals, valuation defaults — and a free-text "Notes" section for market themes / relationships / precedent transactions.
Why: these files give the agent firm-specific context (client names, deal codes, branding) without committing it to a public marketplace repo. CLAUDE.md confirms *.local.md is gitignored.
11. Manifest convenience syntax + harness-side normalization
The agent.yaml files use convenience fields the API does not understand:
| Convenience field | What the script does | Source |
|---|---|---|
system: {file: ...} |
Read file, inline as system: <string> |
deploy-managed-agent.sh:114-130 |
system: {append: ...} |
Append text after the file body | same |
skills: [{path: ...}] |
Zip directory, POST to /v1/skills, replace with {type:custom, skill_id, version} |
deploy-managed-agent.sh:54-86 |
skills: [{from_plugin: ...}] |
Expand to one {path:} per skill in that plugin |
deploy-managed-agent.sh:95-105 |
callable_agents: [{manifest: ...}] |
Recurse: deploy subagent first, replace with {type:agent, id, version} |
deploy-managed-agent.sh:147-153 |
output_schema: (subagent) |
Strip from POST body; consumed by scripts/validate.py instead |
deploy-managed-agent.sh:154 |
${ENV_VAR} in any string |
Substitute from env after character-class check | deploy-managed-agent.sh:36-52 |
This keeps the cookbook readable (relative paths, no skill IDs) while still producing a valid POST body.
12. Deliberate non-features
Worth noting what the repo decided not to do:
- No telemetry. Nothing phones home.
- No package.json / requirements.txt / Makefile at root.
justfileexists but only loads dev-container helpers. - No build step. Markdown and JSON only.
- No DSL. The "agent definition" is plain markdown plus YAML frontmatter — anyone can read or edit it without learning a new format.
- Hooks scaffolded but empty.
hooks/hooks.jsonis[]or{}. The hooks system is wired into Cowork but not used here yet. - No unit tests. Validation is structural (
check.py,test-cookbooks.sh); behaviour testing is left to the firm. - No "Anthropic-internal" anything. A CI check actively blocks references to internal Anthropic infra (
.github/workflows/secret-scan.yml:25-32).
13. Tradeoffs (where I'd push back)
- Bundled-skill drift is a footgun — every PR that edits a skill in
vertical-plugins/must remember to runsync-agent-skills.py. CI catches it, but it's friction; a pre-commit hook would help. - Handoff-as-text vs handoff-as-tool-call — the script's own header documents this. Until the platform exposes a typed handoff primitive, the regex-extract path is the load-bearing weakness.
- No first-class hook usage —
hooks/hooks.jsonstubs are present but empty. This is a missed opportunity for, e.g., auto-runningpython recalc.py model.xlsxafter the model-builder agent emits a.xlsx(a hook onStopwould catch it). - Skill triggering is descripton-string-driven — vague descriptions = unpredictable behavior. The fact that CIM-builder's description literally lists keywords ("Triggers on 'CIM', 'confidential information memorandum', ...") is a tell that this routing is fragile.
ALLOWED_TARGETSis hand-maintained inorchestrate.py:23-27, separate frommarketplace.json. Easy to drift;check.pydoesn't yet cross-check it.- System prompts assume
model: claude-opus-4-7everywhere (every cookbook). Picking the right model per leaf is left to the firm. A leaf reader doing JSON extraction is wasted on Opus; Haiku/Sonnet would be cheaper and just as good.