Entry points & execution flow
1. Where execution starts
| Surface | Entry | Owner |
|---|---|---|
| User command | pnpm tools-dev | tools/dev/src/index.ts |
| Daemon process | apps/daemon/src/cli.ts → bootstrapSidecarRuntime() → Express listen | apps/daemon/ |
| Web process | apps/web Next.js sidecar (proxies /api/* to daemon) | apps/web/sidecar/ |
| Desktop process | apps/desktop/src/main/index.ts → runDesktopMain() → createDesktopRuntime() | apps/desktop/ |
| Packaged Electron | apps/packaged/src/... (starts daemon + web sidecars, registers od://) | apps/packaged/ |
| od CLI subcommand | apps/daemon/src/cli.ts (default, media generate, media wait) | daemon repo |
There's exactly one local entry point — pnpm tools-dev — and the AGENTS.md explicitly forbids restoring root aliases (pnpm dev, pnpm daemon, pnpm preview, pnpm start).
2. pnpm tools-dev lifecycle
tools/dev is a CAC-based CLI:
| Subcommand | Purpose |
|---|---|
| (default) / start | Boot daemon + web (+ desktop on GUI machines) under stamped sidecar processes. Default ports: daemon 7456, web configurable. |
| start <app> | Boot just one app, wait for ready. |
| run <app> | Foreground daemon+web (used by Playwright webServer). |
| stop / stop <app> | Graceful SIGTERM, force SIGKILL after timeout. |
| status [--json] | Query each app's IPC socket; return state + URL. |
| logs [--namespace] [--json] | Tail .tmp/tools-dev/<namespace>/<app>.log. |
| inspect <app> <action> | Tunnel an IPC message: daemon status, web status, desktop status \| screenshot \| eval \| click \| console \| shutdown. |
| check | Validate path layout, no stray processes. |
Flags: --daemon-port, --web-port, --namespace (default "default"). Ports are governed only by these flags; OD_PORT (daemon target) and OD_WEB_PORT (web listener) are exported into child env. NEXT_PORT is forbidden (per AGENTS.md).
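The flag-to-env handoff can be sketched as follows. This is an illustrative shape, not the actual tools/dev source: `DevFlags` and `childEnv` are hypothetical names; the env variable names (OD_PORT, OD_WEB_PORT) and the NEXT_PORT prohibition come from the text above.

```typescript
interface DevFlags {
  daemonPort: number; // --daemon-port, default 7456
  webPort: number;    // --web-port
  namespace: string;  // --namespace, default "default"
}

// Build the environment handed to each spawned app. Only the two OD_*
// variables are exported; NEXT_PORT is deliberately never set.
function childEnv(flags: DevFlags, base: Record<string, string>): Record<string, string> {
  return {
    ...base,
    OD_PORT: String(flags.daemonPort),  // daemon target
    OD_WEB_PORT: String(flags.webPort), // web listener
  };
}
```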
3. Daemon boot path
apps/daemon/src/cli.ts
└─ parseArgs() → { port, open }
└─ if argv[2] === 'media' → run od media generate / wait subcommand
└─ else: bootstrapSidecarRuntime(...) → resolves stamp + namespace + base paths
└─ await import('./server.ts') → createServer({ port, ... })
└─ openDatabase() — apps/daemon/src/db.ts:38 — creates tables/indexes idempotently
└─ scan PATH for agents (parallel `which`) — agents.ts:detectAgents()
└─ Express app + multer
└─ register ~70 routes (server.ts:580-2300)
└─ design.runs = createChatRunService(...) — runs.ts
└─ app.listen(port)
└─ if !nofs.open: open browser at http://localhost:<port>
Key state objects materialized at boot:
- db — better-sqlite3 handle
- design.runs — chat-run service (in-memory map of runId → SSE emitter, status, projectId, conversationId, …)
- agentCapabilitiesMap — populated lazily by detectAgents()'s --help probes
- liveModelCacheMap — caches per-agent dynamic model lists
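The lazy-population pattern behind agentCapabilitiesMap can be sketched like this. The probe is shown synchronously for brevity (the real --help probe shells out asynchronously), and the capability strings are illustrative:

```typescript
// Capabilities are probed at most once per agent and memoized in a Map.
const agentCapabilitiesMap = new Map<string, string[]>();

function getCapabilities(agentId: string, probe: (id: string) => string[]): string[] {
  const cached = agentCapabilitiesMap.get(agentId);
  if (cached) return cached;        // subsequent calls skip the probe
  const caps = probe(agentId);      // e.g. parse the agent's --help output
  agentCapabilitiesMap.set(agentId, caps);
  return caps;
}
```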
4. End-to-end "user types a brief" sequence
This is the canonical happy-path flow (from server.ts:1856-2080):
[1] Browser
POST /api/projects { name, skill_id, design_system_id, metadata }
→ 200 { id }
[2] Browser shows EntryView → ProjectView; loads skill / design-system
GET /api/projects/:id
GET /api/skills · /api/skills/:id
GET /api/design-systems · /api/design-systems/:id
[3] User types message; ChatComposer POSTs:
POST /api/chat { agentId, projectId, conversationId, message, skillId,
designSystemId, model, attachments, commentAttachments }
[4] Daemon /api/chat handler (server.ts:2174):
run = design.runs.create()
design.runs.stream(run, req, res) # opens SSE: text/event-stream
design.runs.start(run, () => startChatRun(req.body, run))
[5] startChatRun() (server.ts:1856):
a. Validate + persist user message via upsertMessage().
b. cwd = ensureProject(PROJECTS_DIR, projectId) # creates dir if missing
c. composeDaemonSystemPrompt({ projectId, skillId, designSystemId }):
· fetch skill body (skills.ts:listSkills/readSkill)
· fetch design-system body (design-systems.ts)
· resolve craft refs from skill's `od.craft.requires` (craft.ts)
· composeSystemPrompt(...) (prompts/system.ts:109)
[DISCOVERY_AND_PHILOSOPHY]
+ [official designer base prompt]
+ [DESIGN.md if any]
+ [Active craft references … if any]
+ [Active skill ... + Pre-flight (Read template.html…)]
+ [Project metadata block]
+ [DECK_FRAMEWORK_DIRECTIVE if isDeckProject && !hasSkillSeed]
+ [MEDIA_GENERATION_CONTRACT if image/video/audio]
d. Build cwdHint + filesListBlock + attachmentHint + commentHint
and wrap them under "# Instructions (read first)\n" in the user
message (because most CLIs don't have a separate system channel).
e. extraAllowedDirs = [SKILLS_DIR, DESIGN_SYSTEMS_DIR] (for --add-dir)
f. agentOptions = { model: safeModel, reasoning: clampedReasoning }
g. resolvedBin = resolveAgentBin(agentId)
if missing → SSE error AGENT_UNAVAILABLE; finish run failed
h. args = def.buildArgs(composedPrompt, safeImages,
extraAllowedDirs, agentOptions, { cwd })
[6] child = spawn(invocation.command, invocation.args, {
cwd, stdio: ['pipe', 'pipe', 'pipe'],
env: { OD_BIN, OD_DAEMON_URL: 'http://127.0.0.1:<port>',
OD_PROJECT_ID, ...process.env },
})
[7] Stream parsing keyed off def.streamFormat:
'claude-stream-json' → createClaudeStreamHandler()
line-delimited JSON, accumulates partial tool_use chunks,
emits typed events (status / text_delta / thinking_delta /
tool_use / tool_result / usage / done)
'copilot-stream-json' → createCopilotStreamHandler()
same shape, mapped from Copilot's assistant.* + tool.* events
'acp-json-rpc' → attachAcpSession()
ACP initialize → session/new → session/prompt;
map turn_start / message_update / tool_use / tool_result /
turn_end → typed events; auto-approve permission requests
'pi-rpc' → attachPiRpcSession()
same as ACP minus the workspace-trust dance, plus
extension UI auto-replies (confirm/select/input/editor)
'json-event-stream' → createJsonEventStreamHandler()
generic line-JSON parser for codex/gemini/opencode/cursor-agent
'plain' → forward chunks verbatim
[8] For every emitted event:
(a) design.runs.emit(run, eventType, payload) → SSE to browser
(b) upsertMessage(db, msgId, { events_json: [...prev, event] })
(c) on 'tool_use' name === 'TodoWrite' → web parses todos.ts
(d) on artifact emission → POST /api/artifacts/save:
lintArtifact(html) → P0/P1/P2 findings →
renderFindingsForAgent(findings) → injected back as next-turn
system message so the agent self-corrects.
[9] On exit:
upsertMessage(...) persist final state
design.runs.finish(run, status, exitCode, signal)
SSE 'done' event closes the stream
[10] Browser:
PreviewModal/iframe srcdoc → updates as artifact lands
FileWorkspace shows newly written files
Comment mode arms once preview loads (apps/web/src/comments.ts)
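The stream handlers in step [7] all share one mechanical core: a chunk of stdout may split a JSON line in the middle, so a carry buffer accumulates partial lines until a newline arrives. A minimal sketch (the factory name is illustrative; the real handlers additionally map raw events to the typed event set above):

```typescript
// Returns a feed function: call it with each stdout chunk; complete JSON
// lines are parsed and forwarded, the trailing partial line is retained.
function createLineJsonParser(onEvent: (ev: unknown) => void): (chunk: string) => void {
  let buffer = "";
  return (chunk: string) => {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep the incomplete tail for the next chunk
    for (const line of lines) {
      const trimmed = line.trim();
      if (!trimmed) continue;
      try {
        onEvent(JSON.parse(trimmed));
      } catch {
        // non-JSON noise on stdout is ignored
      }
    }
  };
}
```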
5. The od media generate subagent path
When the project metadata's kind is image/video/audio, composeSystemPrompt appends MEDIA_GENERATION_CONTRACT (prompts/media-contract.ts:37-340). That contract instructs the agent to dispatch via shell rather than fabricating bytes:
[agent emits: Bash tool call]
od media generate --surface image --model gpt-image-2 --prompt "..." \
--aspect 1:1 --output illustration.png
│
▼
[apps/daemon/src/cli.ts media subcommand]
reads OD_DAEMON_URL + OD_PROJECT_ID from env (injected by spawn() above)
POSTs /api/projects/:id/media/generate
│
▼
[server.ts:1600-1681]
generateMedia() in media.ts:
findProvider(modelId) → 'openai' | 'volcengine' | 'grok' | 'hyperframes'
For sync providers (OpenAI, Grok-image): fetch /v1/images/generations
For async (Volcengine, Grok-video): submit task, return taskId
Apply clampNumber() to durations / lengths
Save bytes to project cwd at --output path
Return { file: { name, size, kind, mime, ... } }
│
▼
[CLI prints JSON descriptor to stdout]
Agent reads the descriptor, narrates the filename in chat,
FileViewer renders the new media in the iframe automatically.
For long-running async tasks the agent uses od media wait --task <id> which polls /api/media/tasks/:id/wait. This unifies image/video/audio across every agent CLI without writing a custom tool per CLI — the contract is shell.
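The clampNumber() guard mentioned above likely has a conventional shape (the signature here is assumed, not read from media.ts): out-of-range provider parameters such as video durations are pulled into a safe range rather than rejected.

```typescript
// Clamp a numeric parameter into [min, max]; non-numeric input falls to min.
function clampNumber(value: number, min: number, max: number): number {
  if (Number.isNaN(value)) return min;
  return Math.min(max, Math.max(min, value));
}
```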
6. The BYOK proxy path (Topology C)
When the user has no CLI installed and types in the BYOK fallback, the browser POSTs directly to /api/proxy/anthropic/stream or /api/proxy/openai/stream (server.ts:2209+). The daemon:
- validateExternalApiBaseUrl() rejects internal IPs and non-http(s) schemes.
- Composes the upstream URL (/v1/messages or /v1/chat/completions).
- fetch(...) upstream with the user's apiKey and a fixed anthropic-version: 2023-06-01.
- Streams the upstream body line-by-line, parsing event: ...\ndata: {...}\n\n blocks and re-emitting them on the browser SSE.
- On error, redacts Bearer <token> before logging.
There's no daemon-side memory of the BYOK key — it's passed through per-request from the browser.
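The two proxy-side safeguards can be sketched as follows. Both are assumptions about shape, not the actual server.ts code: the redaction regex and the internal-host screen here are illustrative (the real validateExternalApiBaseUrl is likely stricter, e.g. resolving DNS before deciding).

```typescript
// Scrub bearer tokens from any string destined for a log line.
function redactBearer(text: string): string {
  return text.replace(/Bearer\s+\S+/g, "Bearer [redacted]");
}

// Reject non-http(s) schemes and obviously internal hosts.
function isAllowedBaseUrl(raw: string): boolean {
  let url: URL;
  try { url = new URL(raw); } catch { return false; }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  const host = url.hostname;
  if (host === "localhost" || host === "127.0.0.1" || host === "::1" ||
      host.startsWith("10.") || host.startsWith("192.168.")) return false;
  return true;
}
```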
7. Comment mode → surgical edit
[browser]
Comment-mode click in iframe → DOM snapshot
apps/web/src/comments.ts:targetFromSnapshot(...)
→ PreviewCommentTarget { path, selector, position, html_hint, text }
POST /api/projects/:id/conversations/:cid/comments? (stored in
preview_comments table via upsertPreviewComment())
[next chat turn]
ChatComposer attaches commentAttachments[] to /api/chat body.
startChatRun() builds commentHint via renderCommentAttachmentHint(...)
and appends it to the composed user message: "User added a comment to
element <selector> in <path>: '<text>'." The agent receives a surgical
edit instruction; capable agents (Claude Code, Devin, Copilot) use the
Edit tool to change just that region.
Comment mode is gated by agent.capabilities.surgicalEdit. For weaker agents (Gemini, plain-text) the daemon documents this gap rather than silently degrading.
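The hint rendering can be sketched from the sentence format quoted above; the function name renderCommentHint and the reduced field set are illustrative (PreviewCommentTarget also carries position and html_hint):

```typescript
interface PreviewCommentTarget {
  path: string;     // file the comment targets, e.g. "index.html"
  selector: string; // CSS selector captured from the DOM snapshot
  text: string;     // the user's comment
}

// Format one comment attachment as the hint appended to the user message.
function renderCommentHint(c: PreviewCommentTarget): string {
  return `User added a comment to element ${c.selector} in ${c.path}: '${c.text}'`;
}
```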
8. State transitions for a chat run
create()
│
▼
pending ─── start() ──► running
│
┌────────────────────┼─────────────────────┐
▼ ▼ ▼
cancel() requested child stdin closes child exits
│ │ │
▼ ▼ ▼
SIGTERM child drain stdout buffer finish(status,
wait then SIGKILL emit final 'done' exitCode, signal)
│ │ │
▼ ▼ ▼
finish('cancelled') finish('completed') finish('failed')
design.runs is in-memory only — restarting the daemon drops all in-flight runs. Persisted state is in messages.events_json so the browser can replay the agent's most recent transcript on reload.
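The diagram above reduces to a small state machine; a sketch under assumed semantics (class name hypothetical; illegal transitions are ignored here, which may differ from the real service):

```typescript
type RunStatus = "pending" | "running" | "completed" | "failed" | "cancelled";

class ChatRunFsm {
  status: RunStatus = "pending";

  // start() is only valid from pending.
  start(): void {
    if (this.status === "pending") this.status = "running";
  }

  // finish() is only valid from running; terminal states never change.
  finish(status: "completed" | "failed" | "cancelled"): void {
    if (this.status === "running") this.status = status;
  }
}
```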