04 — Design Patterns & Notable Decisions

1. "One source, two wrappers"

Pattern: the same canonical artifact (a system prompt, a skill body) is referenced by two delivery wrappers — a Cowork plugin and a Managed-Agent cookbook — instead of duplicated.

Where:

System prompts: managed-agent-cookbooks/<slug>/agent.yaml:6-7 references ../../plugins/agent-plugins/<slug>/agents/<slug>.md via system: { file: ... }.
Skill bundles: same yaml's skills: [{from_plugin: ../../plugins/agent-plugins/<slug>}] (agent.yaml:24-25) — at deploy time the script expands this to one {path: ...} per skill (deploy-managed-agent.sh:95-105).
Subagent skills: explicit {path: ../../../plugins/agent-plugins/<slug>/skills/<skill>} (subagents/deck-writer.yaml:17-19).

Why it matters: the alternative is two parallel directory trees that drift. The drift would be silent — the agent runs, just with stale guidance. scripts/check.py makes drift a CI failure (check.py:114-131).

Trade-off: introduces a vendoring step (scripts/sync-agent-skills.py) — agent plugins bundle a synced copy of each skill so they're self-contained for Cowork install, but the source remains in vertical-plugins/. This is the price of "self-contained Cowork plugin" + "single source of truth" + "no symlinks-in-zip pitfalls."

2. Untrusted-reader / re-verifier / write-only-resolver isolation tiers

Pattern: every managed-agent cookbook that touches outsider documents splits responsibilities across at least three subagents with non-overlapping tool sets. Documented per-agent in the README "Security & handoffs" table.

Canonical example (managed-agent-cookbooks/gl-reconciler/README.md:24-31):

| Tier              | Touches untrusted docs? | Tools                           | Connectors        |
| reader            | YES                     | Read, Grep only                 | None              |
| Orchestrator      | No                      | Read, Grep, Glob, Agent         | Read-only MCPs    |
| resolver (Write)  | No                      | Read, Write, Edit               | None              |

Invariants:

Untrusted readers have no MCP, no Bash, no Write — only Read/Grep. Their output is constrained by output_schema: (subagents/reader.yaml:35-58).
The write-holder (resolver, escalator, note-writer, pack-writer, flagger, poster, deck-writer) has Read/Write/Edit but no MCPs and no untrusted document access.
A separate critic re-verifies break-style claims against trusted MCPs before the resolver consumes the verified set (subagents/critic.yaml:3-8).

Why this works: it's structural defense-in-depth, not behavioral. A prompt-injection payload in a custodian PDF could convince the reader to do anything — but the reader has no tools that matter. The orchestrator only sees length-capped, character-class-restricted JSON (subagents/reader.yaml:40-58). By the time data reaches the write-holder, it's already been (a) schema-trimmed, (b) re-verified against trusted sources, and (c) handed across an agent boundary that the model cannot cross by itself.

The schema constraint is the load-bearing piece. Look at the regex in subagents/reader.yaml:50-58:

account:        { type: string, maxLength: 64,  pattern: "^[A-Za-z0-9._:-]+$" }
evidence_refs:
  items: { type: string, maxLength: 256, pattern: "^[A-Za-z0-9 ._/:#-]+$" }

An attacker's English-language injection cannot survive this filter — the character class excludes spaces (in account), parens, and the punctuation needed for natural-language instructions. Schema-validation thereby acts as a structural sanitizer, not just a contract.

scripts/validate.py:14-37 is the runtime enforcer (the API doesn't enforce this for you), and the deploy script keeps output_schema out of the deployed body (deploy-managed-agent.sh:154 del(.output_schema)) because it's a harness-side concern.

3. Depth-1 enforcement

Pattern: managed-agents are flat. An orchestrator can dispatch leaf workers, but workers cannot dispatch their own subagents.

Where enforced: scripts/test-cookbooks.sh:14-17 — a Python check on each dry-run output:

for i,x in enumerate(b):
    if i<len(b)-1 and x.get('callable_agents'): errs.append(f'{x.get("name")}: depth>1 (subagent has callable_agents)')

Why: depth-2 hierarchies make the security perimeter unreasonable to audit. Each additional hop multiplies the attack surface for prompt injection and gives an attacker more chances to land a Write-equipped tool. Depth-1 also matches the way the Managed Agents preview surface is intended to be used.

4. Handoff-as-text + allowlisted router

Pattern: instead of a built-in tool call, cross-agent handoff is a JSON blob the orchestrator emits in its message text. An out-of-band watcher (scripts/orchestrate.py) parses, validates, and routes it.

Why text instead of a tool call? The Managed Agents preview doesn't ship a first-class handoff primitive yet. Text is a portable shim that any workflow engine can consume.

The known weakness is documented in the script itself (orchestrate.py:8-15):

"""
Security note: handoff requests are surfaced in the orchestrator's text output,
which is downstream of untrusted-document readers. An attacker who controls a
processed document could embed a literal handoff_request blob that, if echoed,
would be parsed here. This script mitigates by (a) hard-allowlisting
target_agent against the deployed slugs and (b) schema-validating the payload
before steering. In production, prefer emitting handoffs via a dedicated tool
call or a typed SSE event the model cannot produce by quoting document text.
"""

This is rare and good — the scaffolding is honest about its own limitations and points the user at the better solution.

Mitigations applied:

ALLOWED_TARGETS allowlist (orchestrate.py:23-27) — even if a malicious document forges a handoff_request, it can only target one of the 10 deployed slugs.
HANDOFF_PAYLOAD_SCHEMA (orchestrate.py:29-38) — event capped at 2000 chars, context_ref restricted to ^[A-Za-z0-9 ._/:#-]+$.
Silent drop on failure (return None in extract_handoff). The agent text still goes to the user; only the routing skips.

5. Skill auto-discovery via description

Pattern: skills don't have to be invoked explicitly — Claude fires them when the conversation matches the skill's description field.

Where: every SKILL.md opens with:

---
name: dcf-model
description: Real DCF (Discounted Cash Flow) model creation for equity valuation. Retrieves financial data from SEC filings... Triggers on "DCF", "intrinsic value", "comprehensive valuation"...
---

(plugins/agent-plugins/pitch-agent/skills/dcf-model/SKILL.md:1-3)

Why: the description doubles as a routing prompt. The agent decides "the user said 'DCF', the dcf-model skill matches" and loads the body. This is simpler than maintaining an explicit skill router but means skill descriptions are de facto code — vague descriptions = unpredictable triggering.

Notice the skill pattern in CIM-builder (plugins/vertical-plugins/investment-banking/skills/cim-builder/SKILL.md:3): "Triggers on 'CIM', 'confidential information memorandum', 'offering memorandum', 'info memo', 'draft CIM', or 'sell-side materials'." The description explicitly lists trigger keywords because the model's matching is description-text-driven.

6. System prompts as workflow contracts

Pattern: every agent system prompt has the same five sections:

What you produce — explicit list of artifacts.
Workflow — numbered steps that name the skills used at each step.
Guardrails — what the agent must not do (publish, post, send email, decide).
Skills this agent uses — back-tick-quoted list.
(Implicit frontmatter) — tools: allowlist of tool families.

Compare the openings of gl-reconciler.md, pitch-agent.md, model-builder.md, kyc-screener.md, earnings-reviewer.md — same skeleton, different domain.

The skill list at the bottom is mechanically checked: check.py:134-143 regex-extracts every back-tick-quoted skill from the agent prompt and verifies it's bundled in the agent plugin's skills/. This means the prompt's narrative claims about which skills exist are kept honest in CI.

7. "Stop and surface for review" as a hard prompt rule

Pattern: every agent's Guardrails section includes some variant of "stop and surface for review."

Agent	Guardrail wording	Source
pitch-agent	"Stop and surface for review after the Excel model is built and again after the deck is generated. The banker approves each artifact before you proceed to the next."	`pitch-agent.md:32`
earnings-reviewer	"Surface for review. Stage the model and note as drafts. Do not publish externally."	`earnings-reviewer.md:24`
model-builder	"Stop and surface after build and again after audit. The user approves before sensitivities."	`model-builder.md:30`
gl-reconciler	"No ledger posting. This agent produces a report; ledger adjustments require human approval outside the agent."	`gl-reconciler.md:29`
kyc-screener	"No risk-rating decision. This agent recommends; the compliance officer decides."	`kyc-screener.md:27`

This is the regulatory/compliance design of the repo encoded in prose: every agent has at least one explicit "you don't decide" rule.

8. CSV-as-pseudocode in skill prompts

Pattern: spreadsheet-building skills describe their target Excel layout as CSV-shaped tables in the prompt itself. Example from dcf-model/SKILL.md:912-960:

Income Statement ($M),2020A,2021A,2022A,2023A,2024E,2025E,2026E
Revenue,XXX,XXX,XXX,XXX,[=E29*(1+$E$10)],[=F29*(1+$E$11)],[=G29*(1+$E$12)]
  % growth,XX%,XX%,XX%,XX%,[=E29/D29-1],[=F29/E29-1],[=G29/F29-1]

Why: CSV is already linear text Claude can pattern-match against, and it's directly transcribable to openpyxl ws["E29"] = "=..." writes. The bracketed [=formula] notation marks "write a formula here", not "compute this Python-side and paste the value" — which the prompt enforces in capitals throughout (SKILL.md:44-49):

Formulas Over Hardcodes (NON-NEGOTIABLE): Every projection, margin, discount factor, PV, and sensitivity cell MUST be a live Excel formula — never a value computed in Python and written as a number.

9. Common-mistakes section as inverted prompting

Pattern: skill bodies include a <common_mistakes> section listing wrong patterns alongside <correct_patterns>. Both are wrapped in pseudo-XML tags.

Example: dcf-model/SKILL.md:368-756:

<correct_patterns>
This section contains all the CORRECT patterns to follow when building DCF models.
...
</correct_patterns>

<common_mistakes>
This section contains all the WRONG patterns to avoid when building DCF models.

### WRONG: Simplified Sensitivity Table Approximations or Placeholder Text

Don't use linear approximations:

// WRONG - Linear approximation B97: =B88*(1+(0.096-0.116)) // Assumes linear relationship


Don't leave placeholder text:

// WRONG - Placeholder note "Note: Use Excel Data Table feature..."

...
</common_mistakes>

This is unusual: the prompt actively contains anti-examples it wants the model to recognize and not emit. The XML tags make the section retrievable as a unit (Claude can refer to "see <common_mistakes>") and the WRONG/INSTEAD pairing is explicit.

The DCF skill goes further with a TOP 5 ERRORS SUMMARY at line 717, then "Re-read this section before starting any DCF build" at 754 — the prompt is, in effect, asking the model to perform a self-review against a known failure list.

10. Local override files

Pattern: verticals ship a .local.md.example that the user copies and customizes; the real file is gitignored.

Example: plugins/vertical-plugins/investment-banking/.claude/investment-banking.local.md.example carries the user's name, title, sectors, active deals, valuation defaults — and a free-text "Notes" section for market themes / relationships / precedent transactions.

Why: these files give the agent firm-specific context (client names, deal codes, branding) without committing it to a public marketplace repo. CLAUDE.md confirms *.local.md is gitignored.

11. Manifest convenience syntax + harness-side normalization

The agent.yaml files use convenience fields the API does not understand:

Convenience field	What the script does	Source
`system: {file: ...}`	Read file, inline as `system: <string>`	`deploy-managed-agent.sh:114-130`
`system: {append: ...}`	Append text after the file body	same
`skills: [{path: ...}]`	Zip directory, POST to `/v1/skills`, replace with `{type:custom, skill_id, version}`	`deploy-managed-agent.sh:54-86`
`skills: [{from_plugin: ...}]`	Expand to one `{path:}` per skill in that plugin	`deploy-managed-agent.sh:95-105`
`callable_agents: [{manifest: ...}]`	Recurse: deploy subagent first, replace with `{type:agent, id, version}`	`deploy-managed-agent.sh:147-153`
`output_schema:` (subagent)	Strip from POST body; consumed by `scripts/validate.py` instead	`deploy-managed-agent.sh:154`
`${ENV_VAR}` in any string	Substitute from env after character-class check	`deploy-managed-agent.sh:36-52`

This keeps the cookbook readable (relative paths, no skill IDs) while still producing a valid POST body.

12. Deliberate non-features

Worth noting what the repo decided not to do:

No telemetry. Nothing phones home.
No package.json / requirements.txt / Makefile at root. justfile exists but only loads dev-container helpers.
No build step. Markdown and JSON only.
No DSL. The "agent definition" is plain markdown plus YAML frontmatter — anyone can read or edit it without learning a new format.
Hooks scaffolded but empty. hooks/hooks.json is [] or {}. The hooks system is wired into Cowork but not used here yet.
No unit tests. Validation is structural (check.py, test-cookbooks.sh); behaviour testing is left to the firm.
No "Anthropic-internal" anything. A CI check actively blocks references to internal Anthropic infra (.github/workflows/secret-scan.yml:25-32).

13. Tradeoffs (where I'd push back)

Bundled-skill drift is a footgun — every PR that edits a skill in vertical-plugins/ must remember to run sync-agent-skills.py. CI catches it, but it's friction; a pre-commit hook would help.
Handoff-as-text vs handoff-as-tool-call — the script's own header documents this. Until the platform exposes a typed handoff primitive, the regex-extract path is the load-bearing weakness.
No first-class hook usage — hooks/hooks.json stubs are present but empty. This is a missed opportunity for, e.g., auto-running python recalc.py model.xlsx after the model-builder agent emits a .xlsx (a hook on Stop would catch it).
Skill triggering is descripton-string-driven — vague descriptions = unpredictable behavior. The fact that CIM-builder's description literally lists keywords ("Triggers on 'CIM', 'confidential information memorandum', ...") is a tell that this routing is fragile.
ALLOWED_TARGETS is hand-maintained in orchestrate.py:23-27, separate from marketplace.json. Easy to drift; check.py doesn't yet cross-check it.
System prompts assume model: claude-opus-4-7 everywhere (every cookbook). Picking the right model per leaf is left to the firm. A leaf reader doing JSON extraction is wasted on Opus; Haiku/Sonnet would be cheaper and just as good.