CodeDocs Vault

LLM usage, prompts & guardrails

This is the most distinctive part of the codebase. OD does not train, fine-tune, or run a model. It does all of its work by layering a deterministic prompt stack and a checklist culture on top of whichever code-agent CLI is present. Most of the engineering is here.

1. Where models actually run

| Path | Where | Notes |
| --- | --- | --- |
| Spawned CLI loop (default) | apps/daemon/src/server.ts:1856-2080 (startChatRun) | The agent CLI runs its own model + tool loop. Daemon does not call the model API directly. |
| BYOK passthrough | apps/daemon/src/server.ts:2209-2287 (Anthropic), 2289+ (OpenAI) | Daemon proxies SSE from upstream Messages/Chat APIs, with internal-IP blocking. No model loop in the daemon. |
| Media generation | apps/daemon/src/media.ts | OpenAI, Volcengine, Grok, HyperFrames. The agent calls od media generate via shell; the daemon calls the provider HTTP API. |
| ACP / Pi-RPC sessions | apps/daemon/src/{acp,pi-rpc}.ts | JSON-RPC over stdio; the daemon drives initialize → session/new → session/prompt lifecycle and maps events. |
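
The ACP lifecycle in the last row can be pictured as three JSON-RPC 2.0 requests written to the child process's stdin. The method names come from the table; the param shapes, the newline-delimited framing, and every helper name here are assumptions made for illustration, not the daemon's code.

```typescript
// Hypothetical sketch: build the initialize -> session/new -> session/prompt
// requests as JSON-RPC 2.0 envelopes. Param shapes are illustrative only.
interface RpcRequestSketch {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: Record<string, unknown>;
}

type RequestFactory = (method: string, params: Record<string, unknown>) => RpcRequestSketch;

// Each factory hands out monotonically increasing ids so responses can be matched.
function makeRequestFactory(): RequestFactory {
  let nextId = 1;
  return (method, params) => ({ jsonrpc: "2.0", id: nextId++, method, params });
}

// Assumed framing: one JSON document per line.
function frameRpc(req: RpcRequestSketch): string {
  return JSON.stringify(req) + "\n";
}

function lifecycleFramesSketch(cwd: string, promptText: string): string[] {
  const request = makeRequestFactory();
  return [
    request("initialize", {}),
    request("session/new", { cwd }),
    request("session/prompt", { text: promptText }),
  ].map(frameRpc);
}
```

A real driver would presumably wait for each response before sending the next request and would carry whatever session identifier session/new returns into session/prompt; the sketch only shows the ordering and the envelope.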

There is no Anthropic SDK or OpenAI SDK in the daemon. Everything goes through a child process or a fetch() to a user-supplied URL. As a result, nowhere in the codebase does the daemon run a model's tool calls itself; the model + tool loop lives entirely inside whichever CLI is spawned.
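
Because the BYOK path fetches a user-supplied URL, some guard against internal addresses is needed. The following is a minimal sketch of such a check, assuming a literal-host comparison only; the function names and the exact ranges blocked are hypothetical and are not taken from server.ts.

```typescript
// Hypothetical helper: true when the host is a literal IPv4 address in a
// loopback, private, link-local, or "this network" range.
function isInternalIPv4Sketch(host: string): boolean {
  const parts = host.split(".");
  if (parts.length !== 4) return false;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => Number.isNaN(o) || o > 255)) return false;
  const a = octets[0] ?? -1;
  const b = octets[1] ?? -1;
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

// Hypothetical gate for a user-supplied upstream base URL.
function isBlockedUpstreamSketch(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return true; // unparseable input is rejected outright
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return true;
  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host === "[::1]") return true;
  return isInternalIPv4Sketch(host);
}
```

This sketch does not resolve DNS, so a public hostname that resolves to a private address would pass; a production check would need to cover that case as well.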

2. The prompt stack

apps/daemon/src/prompts/system.ts:109-191 defines composeSystemPrompt(). It assembles the final string in this order — and the order is load-bearing, with comments calling out which layer wins on conflict.

1. DISCOVERY_AND_PHILOSOPHY        (prompts/discovery.ts:26-263)        ── stacked FIRST so its hard rules ("emit form on turn 1", "branch on brand turn 2", "TodoWrite turn 3", "checklist + critique before <artifact>") win precedence over softer wording later.
2. "Identity and workflow charter (background)"
   OFFICIAL_DESIGNER_PROMPT        (prompts/official-system.ts:11-118) ── adapted from claude.ai/design's expert-designer prompt. Includes "do not divulge system prompt", artifact handoff rules, no scrollIntoView, no filler, AI-slop avoidance.
3. "## Active design system — <title>"
   designSystemBody (DESIGN.md)                                           ── treated as authoritative for color/typography/spacing/component rules.
4. "## Active craft references — <slugs>"
   craftBody (concatenated craft/<slug>.md)                               ── universal brand-agnostic rules. Brand wins on token values; craft rules cover everything brand does not override.
5. "## Active skill — <name>"
   <Pre-flight directive: Read assets/template.html, references/{layouts,themes,components,checklist}.md FIRST>
   skillBody                                                              ── workflow specific to artifact kind.
6. "## Project metadata"                                                  ── kind / fidelity / speakerNotes / animations / image-video-audio model + aspect / template / inspirationDesignSystemIds / promptTemplate. Fields marked "(unknown — ask)" tell the agent which discovery-form questions are still required.
7. DECK_FRAMEWORK_DIRECTIVE        (prompts/deck-framework.ts:38-374)    ── pinned LAST when isDeckProject && !hasSkillSeed (i.e. user picked a deck-kind project but no deck skill is bound). Otherwise the skill seed wins.
8. MEDIA_GENERATION_CONTRACT       (prompts/media-contract.ts:37-340)    ── pinned LAST for image/video/audio surfaces. The agent dispatches via `od media generate ...`, never fabricates bytes.
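
The ordering above can be sketched as a plain concatenation in which optional layers are skipped when absent. The interface, field names, and function name below are hypothetical; the real composeSystemPrompt() signature is not shown in this document, so treat this as an illustration of the ordering only.

```typescript
// Hypothetical layer bundle; field names are illustrative.
interface PromptLayersSketch {
  discovery: string;
  officialDesigner: string;
  designSystem?: { title: string; body: string };
  craft?: { slugs: string[]; body: string };
  skill?: { name: string; preflight: string; body: string };
  metadata: string;
  deckFramework?: string; // set only when isDeckProject && !hasSkillSeed
  mediaContract?: string; // set only for image/video/audio surfaces
}

const HEADER_DASH = "\u2014"; // the section headers listed above use this character

function composeSystemPromptSketch(l: PromptLayersSketch): string {
  const parts: string[] = [
    l.discovery,
    "Identity and workflow charter (background)\n" + l.officialDesigner,
  ];
  if (l.designSystem) {
    parts.push(`## Active design system ${HEADER_DASH} ${l.designSystem.title}\n${l.designSystem.body}`);
  }
  if (l.craft) {
    parts.push(`## Active craft references ${HEADER_DASH} ${l.craft.slugs.join(", ")}\n${l.craft.body}`);
  }
  if (l.skill) {
    parts.push(`## Active skill ${HEADER_DASH} ${l.skill.name}\n${l.skill.preflight}\n${l.skill.body}`);
  }
  parts.push(`## Project metadata\n${l.metadata}`);
  if (l.deckFramework) parts.push(l.deckFramework);
  if (l.mediaContract) parts.push(l.mediaContract);
  return parts.join("\n\n");
}
```

Keeping the assembly as an ordered array makes the precedence visible in one place: moving a layer means moving one line.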

Why this order

3. The discovery form (turn 1)

prompts/discovery.ts:26-71 instructs the agent: "Your very first output is one short prose line + a <question-form id=\"discovery\"> block. Nothing else. No file reads. No Bash. No TodoWrite. No extended thinking."

The form has seven fields:

| id | type | options / example |
| --- | --- | --- |
| output | radio (required) | Slide deck · Single web prototype · Multi-screen app · Dashboard · Editorial · Other |
| platform | radio | Mobile · Desktop · Tablet · Responsive · Fixed canvas (1920×1080) |
| audience | text | "early-stage investors, dev-tools buyers, internal exec review" |
| tone | checkbox (max 2) | Editorial · Modern minimal · Playful · Tech utility · Luxury · Brutalist · Soft warm |
| brand | radio | Pick a direction for me · I have a brand spec · Match a reference site / screenshot |
| scale | text | "8 slides, 1 landing + 3 sub-pages, 4 mobile screens" |
| constraints | textarea | real copy, fonts, things to avoid, deadline |
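
The two constraints visible in the table (output is required, tone allows at most two selections) can be expressed as a small validator. This is a sketch under the assumption that answers arrive as a plain object keyed by field id; the type and function names are hypothetical.

```typescript
// Hypothetical shape for submitted discovery answers, keyed by field id.
interface DiscoveryAnswersSketch {
  output?: string;
  platform?: string;
  audience?: string;
  tone?: string[];
  brand?: string;
  scale?: string;
  constraints?: string;
}

const OUTPUT_OPTIONS_SKETCH = [
  "Slide deck",
  "Single web prototype",
  "Multi-screen app",
  "Dashboard",
  "Editorial",
  "Other",
];

// Returns a list of human-readable problems; an empty list means valid.
function validateDiscoverySketch(a: DiscoveryAnswersSketch): string[] {
  const problems: string[] = [];
  if (!a.output) {
    problems.push("output is required");
  } else if (OUTPUT_OPTIONS_SKETCH.indexOf(a.output) === -1) {
    problems.push("output is not one of the listed options");
  }
  if ((a.tone ?? []).length > 2) {
    problems.push("tone allows at most 2 selections");
  }
  return problems;
}
```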

Critical mechanics:

4. The direction picker (turn 2 branch A)

prompts/discovery.ts:87-101 + prompts/directions.ts:53-184.

When the brand answer is "Pick a direction for me", the agent emits a second form with 5 rich direction cards:

| id | label | accent OKLch | display font | references |
| --- | --- | --- | --- | --- |
| editorial-monocle | Editorial — Monocle / FT magazine | warm rust oklch(58% 0.16 35) | Iowan Old Style / Charter | Monocle, FT Weekend, NYT Magazine, It's Nice That |
| modern-minimal | Modern minimal — Linear / Vercel | (system-aligned greyscale + saturated accent) | SF Pro Display | Linear, Vercel, Notion 2024, Stripe docs |
| warm-soft | Warm soft — Notion / Substack | (warm) | (serif/serif-pair) | Notion, Substack |
| tech-utility | Tech utility — Stripe docs / Anthropic | (technical) | mono-led | Stripe, Anthropic |
| brutalist-experimental | Brutalist experimental — Bloomberg/Brutalist Web | (brutalist) | grotesk | Bloomberg, Brutalist Web |

Each direction carries a complete spec — palette, fonts, posture cues, references — that the agent binds verbatim into the seed template's :root block. No model improvisation on palette values. The README and discovery prompt both call this out as the single biggest reduction in AI-slop variance.

The library has two purposes (per the file's own header comment, directions.ts:11-19):

  1. Render-time: prompt embeds these as choices the user picks from.
  2. Build-time: once chosen, the agent sees the full spec inline (renderDirectionSpecBlock()) and binds tokens deterministically.
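
Deterministic binding of a chosen direction can be pictured as rendering the spec straight into a :root block with no model involvement. The sketch below is not renderDirectionSpecBlock() itself: the interface, the custom property names (--accent, --font-display), and the serif fallback in the font stack are assumptions; only the id, accent value, and font names come from the table above.

```typescript
// Hypothetical, trimmed-down direction spec.
interface DirectionSpecSketch {
  id: string;
  accent: string; // an oklch() colour value
  displayFont: string; // a CSS font-family stack
}

// Renders the spec as CSS custom properties, one token per line.
function renderRootBlockSketch(spec: DirectionSpecSketch): string {
  return [
    `/* direction: ${spec.id} */`,
    ":root {",
    `  --accent: ${spec.accent};`,
    `  --font-display: ${spec.displayFont};`,
    "}",
  ].join("\n");
}

const editorialMonocleSketch: DirectionSpecSketch = {
  id: "editorial-monocle",
  accent: "oklch(58% 0.16 35)",
  displayFont: '"Iowan Old Style", Charter, serif', // fallback is assumed
};
```

Calling renderRootBlockSketch(editorialMonocleSketch) yields a block whose values are copied character for character from the spec, which is the property the section describes.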

5. TodoWrite live plan + 5-dimensional self-critique

prompts/discovery.ts:126-166.

After branch resolution, the agent's first tool call is TodoWrite with a 5–10-item plan. Standard template:

1. Read active DESIGN.md + skill assets (template.html, layouts.md, checklist.md)
2. Bind chosen direction's palette / brand-spec / picked direction to :root
3. Plan section/slide/screen list with rhythm
4. Copy the seed template to project root
5. Paste & fill the planned layouts/screens/slides
6. Replace [REPLACE] placeholders with real, specific copy from the brief
7. Self-check: run references/checklist.md (P0 must all pass)
8. Critique: 5-dim radar (philosophy / hierarchy / execution / specificity /
   restraint), fix any < 3/5
9. Emit single <artifact>

Updates stream live (in_progress → completed per item). The web app's apps/web/src/runtime/todos.ts parses these tool_use events and renders a live-updating Todos card.
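
A parser for these events might look like the following sketch. The event and input shapes (a todos array of content/status pairs, plus a "pending" status) are assumptions made for illustration and are not copied from todos.ts.

```typescript
// Assumed statuses; only in_progress and completed are named in this document.
type TodoStatusSketch = "pending" | "in_progress" | "completed";

interface TodoItemSketch {
  content: string;
  status: TodoStatusSketch;
}

// Hypothetical shape of a streamed tool_use event.
interface ToolUseEventSketch {
  type: string;
  name: string;
  input: unknown;
}

function isTodoStatusSketch(v: unknown): v is TodoStatusSketch {
  return v === "pending" || v === "in_progress" || v === "completed";
}

// Returns the todo list carried by a TodoWrite call, or null for any other event.
function parseTodosSketch(ev: ToolUseEventSketch): TodoItemSketch[] | null {
  if (ev.type !== "tool_use" || ev.name !== "TodoWrite") return null;
  const input = ev.input as { todos?: unknown } | null;
  if (!input || !Array.isArray(input.todos)) return null;
  const items: TodoItemSketch[] = [];
  for (const raw of input.todos) {
    const t = raw as { content?: unknown; status?: unknown } | null;
    // Malformed entries are skipped rather than failing the whole update.
    if (t && typeof t.content === "string" && isTodoStatusSketch(t.status)) {
      items.push({ content: t.content, status: t.status });
    }
  }
  return items;
}
```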

Step 7: checklist self-check. When a skill ships references/checklist.md, the agent reads it, and every P0 item must pass before it emits <artifact>.

Step 8: five-dimensional critique. The agent silently scores itself 1–5 across:

  1. Philosophy — visual posture matches what was asked, no drift to default.
  2. Hierarchy — eye lands in one obvious place per screen.
  3. Execution — typography, spacing, alignment, contrast.
  4. Specificity — every word/number/image specific to this brief.
  5. Restraint — one accent at most twice, one decisive flourish, not three.

Any dimension < 3/5 triggers a fix pass. Two passes is described as normal.
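
The gating rule is simple enough to state as code. In the real system this scoring happens inside the model's own reasoning, not in the daemon, so the sketch below only illustrates the threshold logic; all names are hypothetical.

```typescript
const CRITIQUE_DIMENSIONS = [
  "philosophy",
  "hierarchy",
  "execution",
  "specificity",
  "restraint",
] as const;

type CritiqueDimension = (typeof CRITIQUE_DIMENSIONS)[number];
type CritiqueScores = Record<CritiqueDimension, number>;

// Returns the dimensions scoring below the threshold (3 out of 5 by default),
// in the fixed order above. An empty result means no fix pass is needed.
function dimensionsNeedingFix(scores: CritiqueScores, threshold = 3): CritiqueDimension[] {
  return CRITIQUE_DIMENSIONS.filter((d) => scores[d] < threshold);
}
```

For example, scores of 4, 2, 3, 5, 1 in the listed order flag hierarchy and restraint; a score of exactly 3 passes.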

6. The deck framework contract

prompts/deck-framework.ts:38-374 is a non-negotiable HTML scaffold for kind=deck projects when no skill seed is bound. It's the load-bearing nav / counter / scroll JS / print stylesheet contract that the daemon's PDF stitching depends on.

Lines 38-308 — the verbatim skeleton: 1920×1080 canvas, scale-to-fit JS with transform-origin: top-left, prev/next + counter outside the scaled stage, localStorage position restore, @media print multi-page PDF.

Lines 310-374 — the directive: agent copies skeleton verbatim, fills SLOT content only. Lines 343-353 list 9 specific "Common drift modes — DO NOT DO THESE" to prevent regenerating bugs the team has already fixed.

This is opinionated contract design: instead of teaching the agent how to build a deck framework, it tells the agent which exact HTML to paste.
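
The scale-to-fit behaviour of the skeleton comes down to one ratio: shrink or grow the fixed 1920×1080 stage until it fits the viewport in both dimensions. The following is a sketch of that arithmetic only, with hypothetical function names; the actual skeleton JS in deck-framework.ts is not reproduced here.

```typescript
const DECK_CANVAS_W = 1920;
const DECK_CANVAS_H = 1080;

// The limiting dimension decides the scale, so the stage never overflows.
function fitScaleSketch(viewportW: number, viewportH: number): number {
  return Math.min(viewportW / DECK_CANVAS_W, viewportH / DECK_CANVAS_H);
}

// CSS transform value for the stage; it assumes transform-origin: top left,
// so scaling keeps the stage anchored at the viewport's corner.
function stageTransformSketch(viewportW: number, viewportH: number): string {
  return `scale(${fitScaleSketch(viewportW, viewportH)})`;
}
```

Keeping the prev/next controls and the counter outside the scaled stage, as the skeleton does, means they stay at their natural size regardless of this factor.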

7. The media generation contract

prompts/media-contract.ts:37-340 ensures image/video/audio surfaces use a unified shell-dispatch contract.

Key rules the agent absorbs:

This is how OD unifies media generation across all 12 CLIs without a custom tool per CLI — the integration point is shell.

8. Anti-AI-slop linter (apps/daemon/src/lint-artifact.ts)

This is where the prompt-side guardrails meet automated enforcement. After every artifact save, the linter scans the HTML and returns LintFinding[]. P0 findings are surfaced back to the agent as a system message so the next turn fixes them.

| ID | Lines | Pattern | Severity |
| --- | --- | --- | --- |
| purple-gradient | 124-140 | Violet/purple hex in linear-gradient() | P0 |
| trust-gradient | 155-177 | Two-stop blue→cyan gradient | P0 |
| ai-default-indigo | 197-221 | #6366f1 / #4f46e5 / #7c3aed outside token defs | P0 |
| emoji-icon | 243-269 | ✨🚀🎯⚡🔥💡📈 in heading/button/list context | P0 |
| left-accent-card | 279-294 | Rounded card + colored border-left | P0 |
| sans-display | 306-323 | h1/h2/h3 with font-family: Inter / Roboto / system-sans | P0 |
| invented-metric | 333-347 | "10× faster", "99.9% uptime", "zero-downtime" | P0 |
| filler-copy | 356-370 | "Feature one/two/three", "lorem ipsum", "placeholder text" | P0 |
| scroll-into-view | 379-387 | .scrollIntoView() (breaks iframe previews) | P0 |
| all-caps-no-tracking | 410-449 | text-transform: uppercase without letter-spacing ≥ 0.06em | P1 |
| (token tracking validation) | 452-579 | Cross-check against CSS token definitions | P1 |

Wired into POST /api/artifacts/save and POST /api/artifacts/lint. The list of indigo hexes is kept in sync with craft/anti-ai-slop.md so the prompt's contract and the linter's rules agree.
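
To make the enforcement loop concrete, here is a minimal sketch of three of the rules above. The regular expressions, the finding shape, and the way token definitions are exempted (skipping lines that declare a CSS custom property) are simplifications assumed for illustration; lint-artifact.ts itself is not reproduced here.

```typescript
// Hypothetical finding shape; the real LintFinding type is not shown in this document.
interface LintFindingSketch {
  id: string;
  severity: "P0" | "P1";
  message: string;
}

const INDIGO_HEXES_SKETCH = ["#6366f1", "#4f46e5", "#7c3aed"];

function lintArtifactSketch(html: string): LintFindingSketch[] {
  const findings: LintFindingSketch[] = [];

  if (/\.scrollIntoView\s*\(/.test(html)) {
    findings.push({ id: "scroll-into-view", severity: "P0", message: "scrollIntoView() breaks iframe previews" });
  }

  if (/lorem ipsum|placeholder text|feature (one|two|three)\b/i.test(html)) {
    findings.push({ id: "filler-copy", severity: "P0", message: "filler copy found" });
  }

  // Approximation of "outside token defs": ignore custom property declarations.
  const usesIndigo = html.split("\n").some((line) => {
    if (/^\s*--[\w-]+\s*:/.test(line)) return false;
    const lower = line.toLowerCase();
    return INDIGO_HEXES_SKETCH.some((hex) => lower.includes(hex));
  });
  if (usesIndigo) {
    findings.push({ id: "ai-default-indigo", severity: "P0", message: "default indigo used outside a token definition" });
  }

  return findings;
}

// Builds the text fed back to the agent, or null when there are no P0 findings.
function p0FeedbackSketch(findings: LintFindingSketch[]): string | null {
  const p0 = findings.filter((f) => f.severity === "P0");
  if (p0.length === 0) return null;
  return "Fix before continuing:\n" + p0.map((f) => `- [${f.id}] ${f.message}`).join("\n");
}
```

The second function reflects the loop described above: findings go back into the agent's context as a message so the next turn can repair them.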

9. The craft layer

craft/<slug>.md files are universal brand-agnostic rules a skill opts into via od.craft.requires: [typography, color, anti-ai-slop, ...]. apps/daemon/src/craft.ts reads the matching files, concatenates them with section headers, and hands them to composeSystemPrompt. They get injected between DESIGN.md and the skill body.

Three rules govern how craft, brand, and skill interact:

  1. Brand DESIGN.md tokens win on token values.
  2. Craft rules cover everything brand doesn't override (letter-spacing, accent budget caps, anti-slop patterns).
  3. Skill workflow wins on workflow shape (whether to ship a base.html, etc.).

Missing craft files are dropped silently (craft.ts:21-46) — a skill can forward-reference craft/motion.md before that file exists, and adding the file later just lights up the rule.
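
The silent-drop behaviour can be sketched with the file read injected as a function, which keeps the example free of filesystem access. The function names, the return shape, and the section header format are hypothetical and not taken from craft.ts.

```typescript
// Hypothetical reader: returns the file text, or undefined when craft/<slug>.md is missing.
type CraftReaderSketch = (slug: string) => string | undefined;

function composeCraftBodySketch(
  requires: string[],
  read: CraftReaderSketch,
): { slugs: string[]; body: string } {
  const slugs: string[] = [];
  const sections: string[] = [];
  for (const slug of requires) {
    const text = read(slug);
    if (text === undefined) continue; // missing file: dropped without an error
    slugs.push(slug);
    sections.push(`### craft/${slug}.md\n${text.trim()}`);
  }
  return { slugs, body: sections.join("\n\n") };
}
```

Returning the surviving slugs alongside the body lets the caller build a header that lists only the craft files that were actually found.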

10. Permission posture (per agent)

Every CLI is launched with auto-approve flags so the model doesn't hang waiting for an interactive prompt the web has no UI to surface (agents.ts:60-65). Specifics:

| Agent | Flags / mechanism |
| --- | --- |
| Claude Code | --permission-mode bypassPermissions, optional --include-partial-messages (probed) |
| Codex | --full-auto --skip-git-repo-check, model-specific reasoning effort clamping |
| Devin | --permission-mode dangerous --respect-workspace-trust false acp |
| OpenCode | --dangerously-skip-permissions |
| GitHub Copilot CLI | --allow-all-tools --add-dir <skills> --add-dir <design-systems> |
| Gemini CLI | --skip-trust --yolo, prompt via stdin (avoids ENAMETOOLONG) |
| Cursor Agent | workspace pinned via --workspace <cwd> |
| ACP-driven (Devin/Hermes/Kimi/Kiro) | auto-approve in ACP choosePermissionOutcome() (acp.ts:51-60) |
| Pi | extension UI auto-replies (pi-rpc.ts:25-64) |

The tradeoff is stated bluntly in the file headers: cwd is the only sandbox, and all reads/writes inside the project dir are allowed unconditionally. The bet is that agent permission models already exist and OD inherits them, with auto-approve switched on so the no-TTY web context doesn't deadlock.
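
One way to picture the per-agent posture is a lookup from agent to the auto-approve flags listed in the table, prepended to whatever other arguments the run needs. The lookup keys and function name are hypothetical; the flags are the ones from the table, and how agents.ts actually structures this is not shown here.

```typescript
// Flags are taken from the table above; the keys are illustrative identifiers.
const AUTO_APPROVE_FLAGS_SKETCH: Record<string, string[]> = {
  "claude-code": ["--permission-mode", "bypassPermissions"],
  codex: ["--full-auto", "--skip-git-repo-check"],
  opencode: ["--dangerously-skip-permissions"],
  "copilot-cli": ["--allow-all-tools"],
  "gemini-cli": ["--skip-trust", "--yolo"],
};

// Fails loudly for unknown agents, so no CLI is launched without
// auto-approve and left waiting on a prompt nobody can answer.
function buildAgentArgsSketch(agent: string, rest: string[]): string[] {
  const flags = AUTO_APPROVE_FLAGS_SKETCH[agent];
  if (!flags) throw new Error(`no auto-approve flags known for ${agent}`);
  return [...flags, ...rest];
}
```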

11. Other prompt-side guardrails

12. Imports OD inherits, with attribution

Three substantial portions of the prompt stack are credited:

  1. alchaincyf/huashu-design — Junior-Designer mode, variations-not-answers, anti-AI-slop, embody-the-specialist, 5-step brand-spec extraction, "5 schools × 20 philosophies." Distilled into discovery.ts:174-250 and the direction library's spec.
  2. op7418/guizang-ppt-skill — pre-flight asset reads, P0 self-check, theme-rhythm rules. Bundled verbatim under skills/guizang-ppt/.
  3. refero_skill (MIT) — adapted into craft/anti-ai-slop.md and tightened to match the linter's enforcement surface.

Open Design's prompt-engineering contribution is the assembly logic — the order of concatenation, the metadata block that bridges create-time UI choices into mid-session form generation, the deck framework as a contract rather than a guideline, and the linter that closes the loop from agent output back into agent context.