Entry points & execution flow
1. Where execution starts
| Surface | Entry | Owner |
|---|---|---|
| User command | pnpm tools-dev | tools/dev/src/index.ts |
| Daemon process | apps/daemon/src/cli.ts → bootstrapSidecarRuntime() → Express listen | apps/daemon/ |
| Web process | apps/web Next.js sidecar (proxies /api/* to daemon) | apps/web/sidecar/ |
| Desktop process | apps/desktop/src/main/index.ts → runDesktopMain() → createDesktopRuntime() | apps/desktop/ |
| Packaged Electron | apps/packaged/src/... (starts daemon + web sidecars, registers od://) | apps/packaged/ |
| od CLI subcommand | apps/daemon/src/cli.ts (default, media generate, media wait) | daemon repo |
There's exactly one local entry point — pnpm tools-dev — and the AGENTS.md explicitly forbids restoring root aliases (pnpm dev, pnpm daemon, pnpm preview, pnpm start).
2. pnpm tools-dev lifecycle
tools/dev is a CAC-based CLI:
| Subcommand | Purpose |
|---|---|
| (default) / start | Boot daemon + web (+ desktop on GUI machines) under stamped sidecar processes. Default ports: daemon 7456, web configurable. |
| start <app> | Boot just one app, wait for ready. |
| run <app> | Foreground daemon+web (used by Playwright webServer). |
| stop / stop <app> | Graceful SIGTERM, force SIGKILL after timeout. |
| status [--json] | Query each app's IPC socket; return state + URL. |
| logs [--namespace] [--json] | Tail .tmp/tools-dev/<namespace>/<app>.log. |
| inspect <app> <action> | Tunnel an IPC message: daemon status, web status, desktop status \| screenshot \| eval \| click \| console \| shutdown. |
| check | Validate path layout, no stray processes. |
Flags: --daemon-port, --web-port, --namespace (default "default"). Ports are governed only by these flags; OD_PORT (daemon target) and OD_WEB_PORT (web listener) are exported into child env. NEXT_PORT is forbidden (per AGENTS.md).
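The flag-to-env handoff can be sketched as follows. This is an illustrative shape, not the actual tools/dev source: `DevFlags` and `childEnv` are hypothetical names; the env variable names (OD_PORT, OD_WEB_PORT) and the NEXT_PORT prohibition come from the text above.

```typescript
interface DevFlags {
  daemonPort: number; // --daemon-port, default 7456
  webPort: number;    // --web-port
  namespace: string;  // --namespace, default "default"
}

// Build the environment handed to each spawned app. Only the two OD_*
// variables are exported; NEXT_PORT is deliberately never set.
function childEnv(flags: DevFlags, base: Record<string, string>): Record<string, string> {
  return {
    ...base,
    OD_PORT: String(flags.daemonPort),  // daemon target
    OD_WEB_PORT: String(flags.webPort), // web listener
  };
}
```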
3. Daemon boot path
apps/daemon/src/cli.ts
└─ parseArgs() → { port, open }
└─ if argv[2] === 'media' → run od media generate / wait subcommand
└─ else: bootstrapSidecarRuntime(...) → resolves stamp + namespace + base paths
└─ await import('./server.ts') → createServer({ port, ... })
└─ openDatabase() — apps/daemon/src/db.ts:38 — creates tables/indexes idempotently
└─ scan PATH for agents (parallel `which`) — agents.ts:detectAgents()
└─ Express app + multer
└─ register ~70 routes (server.ts:580-2300)
└─ design.runs = createChatRunService(...) — runs.ts
└─ app.listen(port)
└─ if !nofs.open: open browser at http://localhost:<port>
Key state objects materialized at boot:
- db — better-sqlite3 handle
- design.runs — chat-run service (in-memory map of runId → SSE emitter, status, projectId, conversationId, …)
- agentCapabilitiesMap — populated lazily by detectAgents()'s --help probes
- liveModelCacheMap — caches per-agent dynamic model lists
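The lazy-population pattern behind agentCapabilitiesMap can be sketched like this. The probe is shown synchronously for brevity (the real --help probe shells out asynchronously), and the capability strings are illustrative:

```typescript
// Capabilities are probed at most once per agent and memoized in a Map.
const agentCapabilitiesMap = new Map<string, string[]>();

function getCapabilities(agentId: string, probe: (id: string) => string[]): string[] {
  const cached = agentCapabilitiesMap.get(agentId);
  if (cached) return cached;        // subsequent calls skip the probe
  const caps = probe(agentId);      // e.g. parse the agent's --help output
  agentCapabilitiesMap.set(agentId, caps);
  return caps;
}
```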
4. End-to-end "user types a brief" sequence
This is the canonical happy-path flow (from server.ts:1856-2080):
[1] Browser
POST /api/projects { name, skill_id, design_system_id, metadata }
→ 200 { id }
[2] Browser shows EntryView → ProjectView; loads skill / design-system
GET /api/projects/:id
GET /api/skills · /api/skills/:id
GET /api/design-systems · /api/design-systems/:id
[3] User types message; ChatComposer POSTs:
POST /api/chat { agentId, projectId, conversationId, message, skillId,
designSystemId, model, attachments, commentAttachments }
[4] Daemon /api/chat handler (server.ts:2174):
run = design.runs.create()
design.runs.stream(run, req, res) # opens SSE: text/event-stream
design.runs.start(run, () => startChatRun(req.body, run))
[5] startChatRun() (server.ts:1856):
a. Validate + persist user message via upsertMessage().
b. cwd = ensureProject(PROJECTS_DIR, projectId) # creates dir if missing
c. composeDaemonSystemPrompt({ projectId, skillId, designSystemId }):
· fetch skill body (skills.ts:listSkills/readSkill)
· fetch design-system body (design-systems.ts)
· resolve craft refs from skill's `od.craft.requires` (craft.ts)
· composeSystemPrompt(...) (prompts/system.ts:109)
[DISCOVERY_AND_PHILOSOPHY]
+ [official designer base prompt]
+ [DESIGN.md if any]
+ [Active craft references … if any]
+ [Active skill ... + Pre-flight (Read template.html…)]
+ [Project metadata block]
+ [DECK_FRAMEWORK_DIRECTIVE if isDeckProject && !hasSkillSeed]
+ [MEDIA_GENERATION_CONTRACT if image/video/audio]
d. Build cwdHint + filesListBlock + attachmentHint + commentHint
and wrap them under "# Instructions (read first)\n" in the user
message (because most CLIs don't have a separate system channel).
e. extraAllowedDirs = [SKILLS_DIR, DESIGN_SYSTEMS_DIR] (for --add-dir)
f. agentOptions = { model: safeModel, reasoning: clampedReasoning }
g. resolvedBin = resolveAgentBin(agentId)
if missing → SSE error AGENT_UNAVAILABLE; finish run failed
h. args = def.buildArgs(composedPrompt, safeImages,
extraAllowedDirs, agentOptions, { cwd })
[6] child = spawn(invocation.command, invocation.args, {
cwd, stdio: ['pipe', 'pipe', 'pipe'],
env: { OD_BIN, OD_DAEMON_URL: 'http://127.0.0.1:<port>',
OD_PROJECT_ID, ...process.env },
})
[7] Stream parsing keyed off def.streamFormat:
'claude-stream-json' → createClaudeStreamHandler()
line-delimited JSON, accumulates partial tool_use chunks,
emits typed events (status / text_delta / thinking_delta /
tool_use / tool_result / usage / done)
'copilot-stream-json' → createCopilotStreamHandler()
same shape, mapped from Copilot's assistant.* + tool.* events
'acp-json-rpc' → attachAcpSession()
ACP initialize → session/new → session/prompt;
map turn_start / message_update / tool_use / tool_result /
turn_end → typed events; auto-approve permission requests
'pi-rpc' → attachPiRpcSession()
same as ACP minus the workspace-trust dance, plus
extension UI auto-replies (confirm/select/input/editor)
'json-event-stream' → createJsonEventStreamHandler()
generic line-JSON parser for codex/gemini/opencode/cursor-agent
'plain' → forward chunks verbatim
[8] For every emitted event:
(a) design.runs.emit(run, eventType, payload) → SSE to browser
(b) upsertMessage(db, msgId, { events_json: [...prev, event] })
(c) on 'tool_use' name === 'TodoWrite' → web parses todos.ts
(d) on artifact emission → POST /api/artifacts/save:
lintArtifact(html) → P0/P1/P2 findings →
renderFindingsForAgent(findings) → injected back as next-turn
system message so the agent self-corrects.
[9] On exit:
upsertMessage(...) persist final state
design.runs.finish(run, status, exitCode, signal)
SSE 'done' event closes the stream
[10] Browser:
PreviewModal/iframe srcdoc → updates as artifact lands
FileWorkspace shows newly written files
Comment mode arms once preview loads (apps/web/src/comments.ts)
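The stream handlers in step [7] all share one mechanical core: a chunk of stdout may split a JSON line in the middle, so a carry buffer accumulates partial lines until a newline arrives. A minimal sketch (the factory name is illustrative; the real handlers additionally map raw events to the typed event set above):

```typescript
// Returns a feed function: call it with each stdout chunk; complete JSON
// lines are parsed and forwarded, the trailing partial line is retained.
function createLineJsonParser(onEvent: (ev: unknown) => void): (chunk: string) => void {
  let buffer = "";
  return (chunk: string) => {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep the incomplete tail for the next chunk
    for (const line of lines) {
      const trimmed = line.trim();
      if (!trimmed) continue;
      try {
        onEvent(JSON.parse(trimmed));
      } catch {
        // non-JSON noise on stdout is ignored
      }
    }
  };
}
```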
5. The od media generate subagent path
When the project metadata's kind is image/video/audio, composeSystemPrompt appends MEDIA_GENERATION_CONTRACT (prompts/media-contract.ts:37-340). That contract instructs the agent to dispatch via shell rather than fabricating bytes:
[agent emits: Bash tool call]
od media generate --surface image --model gpt-image-2 --prompt "..." \
--aspect 1:1 --output illustration.png
│
▼
[apps/daemon/src/cli.ts media subcommand]
reads OD_DAEMON_URL + OD_PROJECT_ID from env (injected by spawn() above)
POSTs /api/projects/:id/media/generate
│
▼
[server.ts:1600-1681]
generateMedia() in media.ts:
findProvider(modelId) → 'openai' | 'volcengine' | 'grok' | 'hyperframes'
For sync providers (OpenAI, Grok-image): fetch /v1/images/generations
For async (Volcengine, Grok-video): submit task, return taskId
Apply clampNumber() to durations / lengths
Save bytes to project cwd at --output path
Return { file: { name, size, kind, mime, ... } }
│
▼
[CLI prints JSON descriptor to stdout]
Agent reads the descriptor, narrates the filename in chat,
FileViewer renders the new media in the iframe automatically.
For long-running async tasks the agent uses od media wait --task <id> which polls /api/media/tasks/:id/wait. This unifies image/video/audio across every agent CLI without writing a custom tool per CLI — the contract is shell.
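The clampNumber() guard mentioned above likely has a conventional shape (the signature here is assumed, not read from media.ts): out-of-range provider parameters such as video durations are pulled into a safe range rather than rejected.

```typescript
// Clamp a numeric parameter into [min, max]; non-numeric input falls to min.
function clampNumber(value: number, min: number, max: number): number {
  if (Number.isNaN(value)) return min;
  return Math.min(max, Math.max(min, value));
}
```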
6. The BYOK proxy path (Topology C)
When the user has no CLI installed and types in the BYOK fallback, the browser POSTs directly to /api/proxy/anthropic/stream or /api/proxy/openai/stream (server.ts:2209+). The daemon:
- validateExternalApiBaseUrl() rejects internal IPs and non-http(s) schemes.
- Composes the upstream URL (/v1/messages or /v1/chat/completions).
- fetch(...) upstream with the user's apiKey and a fixed anthropic-version: 2023-06-01.
- Streams the upstream body line-by-line, parsing event: ...\ndata: {...}\n\n blocks and re-emitting them on the browser SSE.
- On error, redacts Bearer <token> before logging.
There's no daemon-side memory of the BYOK key — it's passed through per-request from the browser.
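The two proxy-side safeguards can be sketched as follows. Both are assumptions about shape, not the actual server.ts code: the redaction regex and the internal-host screen here are illustrative (the real validateExternalApiBaseUrl is likely stricter, e.g. resolving DNS before deciding).

```typescript
// Scrub bearer tokens from any string destined for a log line.
function redactBearer(text: string): string {
  return text.replace(/Bearer\s+\S+/g, "Bearer [redacted]");
}

// Reject non-http(s) schemes and obviously internal hosts.
function isAllowedBaseUrl(raw: string): boolean {
  let url: URL;
  try { url = new URL(raw); } catch { return false; }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  const host = url.hostname;
  if (host === "localhost" || host === "127.0.0.1" || host === "::1" ||
      host.startsWith("10.") || host.startsWith("192.168.")) return false;
  return true;
}
```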
7. Comment mode → surgical edit
[browser]
Comment-mode click in iframe → DOM snapshot
apps/web/src/comments.ts:targetFromSnapshot(...)
→ PreviewCommentTarget { path, selector, position, html_hint, text }
POST /api/projects/:id/conversations/:cid/comments? (stored in
preview_comments table via upsertPreviewComment())
[next chat turn]
ChatComposer attaches commentAttachments[] to /api/chat body.
startChatRun() builds commentHint via renderCommentAttachmentHint(...)
and appends it to the composed user message: "User added a comment to
element <selector> in <path>: '<text>'." The agent receives a surgical
edit instruction; capable agents (Claude Code, Devin, Copilot) use the
Edit tool to change just that region.
Comment mode is gated by agent.capabilities.surgicalEdit. For weaker agents (Gemini, plain-text) the daemon documents this gap rather than silently degrading.
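The hint rendering can be sketched from the sentence format quoted above; the function name renderCommentHint and the reduced field set are illustrative (PreviewCommentTarget also carries position and html_hint):

```typescript
interface PreviewCommentTarget {
  path: string;     // file the comment targets, e.g. "index.html"
  selector: string; // CSS selector captured from the DOM snapshot
  text: string;     // the user's comment
}

// Format one comment attachment as the hint appended to the user message.
function renderCommentHint(c: PreviewCommentTarget): string {
  return `User added a comment to element ${c.selector} in ${c.path}: '${c.text}'`;
}
```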
8. State transitions for a chat run
create()
│
▼
pending ─── start() ──► running
│
┌────────────────────┼─────────────────────┐
▼ ▼ ▼
cancel() requested child stdin closes child exits
│ │ │
▼ ▼ ▼
SIGTERM child drain stdout buffer finish(status,
wait then SIGKILL emit final 'done' exitCode, signal)
│ │ │
▼ ▼ ▼
finish('cancelled') finish('completed') finish('failed')
design.runs is in-memory only — restarting the daemon drops all in-flight runs. Persisted state is in messages.events_json so the browser can replay the agent's most recent transcript on reload.
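The diagram above reduces to a small state machine; a sketch under assumed semantics (class name hypothetical; illegal transitions are ignored here, which may differ from the real service):

```typescript
type RunStatus = "pending" | "running" | "completed" | "failed" | "cancelled";

class ChatRunFsm {
  status: RunStatus = "pending";

  // start() is only valid from pending.
  start(): void {
    if (this.status === "pending") this.status = "running";
  }

  // finish() is only valid from running; terminal states never change.
  finish(status: "completed" | "failed" | "cancelled"): void {
    if (this.status === "running") this.status = status;
  }
}
```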