LLM Integration
How Superset leverages Large Language Models — what models, what prompts, what tools, what guardrails, what streaming, what's clever, what's risky.
Superset has two layers of LLM usage:
- Orchestration layer — Superset runs third-party agent CLIs (Claude Code, Codex, Cursor Agent, Gemini CLI, …) as black boxes inside a PTY. The LLM runs inside those agents; Superset is the harness around them.
- Internal layer — Superset also makes LLM calls itself for in-app chat (Mastracode runtime), title generation, observer/reflector flows, and small auxiliary tasks.
Understanding both is essential — and noticing the difference is half the lesson.
1. Models In Use
| Model family | Used for | Where |
|---|---|---|
| claude-haiku-4-5-20251001 | "Small model" tasks: title generation, observer/reflector models in Mastra | packages/chat/src/server/shared/small-model/get-small-model.ts:157-178 |
| gpt-4o-mini | Same — fallback when Anthropic auth is not present | same |
| Whatever the user's Claude/Codex/Cursor CLI uses | The actual coding agent — Superset doesn't pick the model; the agent does | each agent's own config (e.g., ~/.claude/) |
Auth resolution order for the small-model: env vars → mastracode storage (API key) → Claude OAuth tokens → OpenAI fallback. This is the file that contains the OAuth-via-Claude-Code path with special headers (get-small-model.ts:157-178).
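That resolution order can be expressed as a simple first-hit-wins probe chain. The sketch below is illustrative — the type and helper names are hypothetical, not the real code in get-small-model.ts:

```typescript
// Hypothetical sketch of the auth fallback chain; names are illustrative.
type AuthSource = { kind: "anthropic-key" | "anthropic-oauth" | "openai-key"; token: string };
type AuthProbe = () => AuthSource | undefined;

function resolveSmallModelAuth(probes: AuthProbe[]): AuthSource | undefined {
  for (const probe of probes) {
    const found = probe();
    if (found) return found; // first hit wins — array order encodes priority
  }
  return undefined;
}

// Priority order: env vars → mastracode storage → Claude OAuth → OpenAI fallback
const auth = resolveSmallModelAuth([
  () => undefined, // env vars (lookup elided)
  () => undefined, // mastracode storage API key (elided)
  () => undefined, // Claude OAuth tokens (elided)
  () => ({ kind: "openai-key", token: "sk-…" }), // OpenAI fallback
]);
```

The point of the shape: adding a new credential source is one probe in the array, and priority is visible at a glance.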
2. SDKs & Frameworks
| Dependency | Role |
|---|---|
| @ai-sdk/anthropic 3.0.43 | Anthropic provider for Vercel AI SDK |
| @ai-sdk/openai 3.0.36 | OpenAI provider |
| @ai-sdk/react 3.0 | UI streaming hooks (useChat, useObject, etc.) |
| ai 6.0 | Unified Vercel AI SDK |
| @mastra/core 1.26.0-alpha.3 | Mastracode harness — agent runtime with hooks, tool approvals, observer/reflector |
| @mastra/mcp | MCP client that Mastracode uses to talk to Superset's own MCP server |
The presence of both Vercel AI SDK and Mastracode is intentional: Mastra owns the agent loop (system prompts, tool-use, approvals); Vercel AI SDK is used at the edges (UI streaming via @ai-sdk/react, one-shot title generation).
3. Prompt Templates
Superset's own prompts (i.e. the ones it composes on the user's behalf and sends to the agent) live in packages/shared/src/agent-prompt-template.ts.
3.1 Default terminal task prompt (agent-prompt-template.ts:62-78)
```
Task: "{{title}}" ({{slug}})
Priority: {{priority}}
Status: {{statusName}}
Labels: {{labels}}
{{description}}
Work in the current workspace. Inspect the relevant code, make the needed changes, verify them when practical, and update task "{{id}}" with a short summary when done.
```

3.2 Default chat task prompt (agent-prompt-template.ts:64+)
```
Task: "{{title}}" ({{slug}})
Priority: {{priority}}
Status: {{statusName}}
Labels: {{labels}}
{{description}}
Help with this task in the current workspace and take the next concrete step.
```

The terminal version is more directive ("make the needed changes, verify them when practical, update the task when done"). The chat version is conversational ("take the next concrete step").
3.3 Context prompt template (user-level)
```
{{userPrompt}}
{{tasks}}
{{issues}}
{{prs}}
{{attachments}}
```

The default system template is empty — the agent's own system prompt is preserved; Superset never overrides it. Each agent (Claude/Codex/Cursor) has its own per-agent template that drops in dialect-specific fences (XML for Claude, markdown for Codex/Cursor).
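The {{placeholder}} substitution itself is the simplest part of the contract. As a minimal illustrative reimplementation (not the actual code in agent-prompt-template.ts), a fill step could look like:

```typescript
// Minimal {{placeholder}} fill; unknown keys collapse to "" so optional
// sections (labels, attachments, …) can simply be absent.
function fillTemplate(tpl: string, vars: Record<string, string>): string {
  return tpl.replace(/\{\{(\w+)\}\}/g, (_match, key: string) => vars[key] ?? "");
}

const rendered = fillTemplate('Task: "{{title}}" ({{slug}})', {
  title: "Fix login",
  slug: "fix-login",
});
```

The interesting design work is upstream of this: which sections exist, which scope they carry, and which dialect fences each agent gets.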
3.4 Title generation prompt (packages/chat/src/server/desktop/title-generation/title-generation.ts:61)
```
instructions: params.instructions ?? "You generate concise titles."
```

Run by a Mastra Agent against the small model (Haiku or 4o-mini) using agent.generateTitleFromUserMessage({ message, tracingContext }).
3.5 What you won't find
There is no monolithic "system prompt for the coding agent" inside Superset itself. That's deliberate — the coding agents (Claude Code, Codex, etc.) ship with their own system prompts; Superset's role is to deliver context, not to override their personalities. The closest thing to a Superset-authored coding-agent prompt is the task template above, which composes into the agent's user message.
4. Tool Surface (MCP)
Superset exposes MCP tools at three places: the cloud API (apps/api/api/agent/[transport]), the desktop MCP (in-process, for Claude Code / Codex when run locally), and the host-service's mastracode harness (which calls back to the cloud MCP).
4.1 MCP v1 — packages/mcp/src/tools/index.ts
17 tools, organized into:
| Category | Tools |
|---|---|
| Devices/workspaces | listDevices, listWorkspaces, listProjects, getWorkspaceDetails, createWorkspace, switchWorkspace, deleteWorkspace, updateWorkspace, getAppContext |
| Tasks | createTask, updateTask, listTasks, getTask, deleteTask, listTaskStatuses |
| Org | listMembers |
| Agent sessions | startAgentSession |
startAgentSession is the meta tool — it lets one agent dispatch another to a different workspace.
4.2 MCP v2 — packages/mcp-v2/src/tools/register.ts
Adds automations (cron-like recurring agent runs):
| Category | Tools |
|---|---|
| Automations | automations_create, _list, _get, _update, _delete, _pause, _resume, _run, _logs |
| Automation prompts | automations_get_prompt, automations_set_prompt — exposes prompt editing to agents |
Plus enhanced workspace/project/host tools.
The promotion of "edit your own automation prompt" to a first-class tool is a notable design choice — it lets a long-running agent rewrite its own future-instance instructions over time.
4.3 Tool definition style (packages/mcp-v2/src/tools/automations/create.ts)
```
defineTool(server, {
  name: "automations_create",
  description: "Schedule a recurring agent run...",
  inputSchema: {
    name: z.string().min(1).max(200),
    prompt: z.string().min(1).max(100_000),
    agentConfig: z.object({
      id: z.string().min(1),
      kind: z.enum(["terminal", "chat"]),
    }).passthrough(),
    rrule: z.string(), // RFC 5545
    // … timezone, mcpScope, etc.
  },
  handler: async (input, ctx) => caller.automation.create(input),
});
```

Patterns to notice:
- Zod schemas double as JSON Schema (via zod-to-json-schema) for the MCP advert.
- Hard caps on prompt length (100K chars), name length (200), and so on.
- The handler is a thin shim over a tRPC caller — same business logic for tRPC and MCP.
4.4 Tool registration with emitter (packages/mcp-v2/src/tools/register.ts:57-65)
```
export function registerTools(server: McpServer, options?: RegisterToolsOptions): void {
  setServerToolCallEmitter(server, options?.onToolCall);
  for (const mod of REGISTRARS) mod.register(server);
}
```

The emitter injects an audit hook — every tool call goes through it, allowing logging/usage tracking without touching individual tools.
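The underlying idea — wrap every handler once at registration time so auditing never touches individual tools — can be sketched in a few lines. This is an illustrative reimplementation, not Superset's actual emitter code:

```typescript
// Hypothetical sketch of the "emitter wraps every tool" pattern.
type ToolHandler = (input: unknown) => Promise<unknown>;
type ToolCallEvent = { tool: string };

function withEmitter(
  name: string,
  handler: ToolHandler,
  emit?: (event: ToolCallEvent) => void,
): ToolHandler {
  return async (input) => {
    emit?.({ tool: name }); // audit hook fires before every call, for every tool
    return handler(input);
  };
}

const calls: string[] = [];
const create = withEmitter(
  "automations_create",
  async (input) => input, // stand-in for the real handler
  (event) => calls.push(event.tool),
);
```

Because the wrapping happens in one place (registration), individual tool modules stay free of logging concerns.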
5. Agent Presets — Launching The Real CLIs
packages/host-service/src/trpc/router/settings/agent-presets.ts:31-126 is where the matrix of supported agents lives:
```
{ presetId: "claude", label: "Claude",
  command: "claude", args: ["--permission-mode", "acceptEdits"],
  promptTransport: "argv", promptArgs: [] },
{ presetId: "codex", label: "Codex",
  command: "codex", args: ["-c", 'model_reasoning_effort="high"', …],
  promptTransport: "argv", promptArgs: ["--"] },
{ presetId: "gemini", label: "Gemini",
  command: "gemini", args: ["--approval-mode=auto_edit"],
  promptTransport: "argv", promptArgs: [] },
{ presetId: "mastracode", label: "Mastracode",
  command: "mastracode", promptTransport: "argv", promptArgs: ["--prompt"] },
// + claude-yolo, codex-yolo, opencode, cursor-agent, copilot, amp, pi, …
```

promptTransport is either "argv" (the prompt is a CLI arg) or "stdin" (the prompt is piped). promptArgs are the flags that introduce the prompt arg.
The "yolo" variants are looser-permission profiles (e.g., --dangerously-skip-permissions) — kept as separate presets, not flag toggles, so the user has to opt in deliberately.
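How a preset plus a prompt becomes a concrete launch can be sketched as below. The field names mirror the preset table; the function itself is an illustrative assumption, not the real launcher:

```typescript
// Illustrative sketch: turn a preset + prompt into a spawn description.
type AgentPreset = {
  command: string;
  args: string[];
  promptTransport: "argv" | "stdin";
  promptArgs: string[];
};

function buildLaunch(
  preset: AgentPreset,
  prompt: string,
): { command: string; args: string[]; stdin?: string } {
  if (preset.promptTransport === "argv") {
    // promptArgs introduce the prompt (e.g. Codex's "--", mastracode's "--prompt")
    return { command: preset.command, args: [...preset.args, ...preset.promptArgs, prompt] };
  }
  // stdin transport: args untouched, prompt piped to the process
  return { command: preset.command, args: preset.args, stdin: prompt };
}

const launch = buildLaunch(
  { command: "codex", args: ["-c", "x"], promptTransport: "argv", promptArgs: ["--"] },
  "fix the failing test",
);
```

Keeping transport as data rather than per-agent branches is what lets new CLIs slot in as one more preset row.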
5.1 Wrapper hook injection (apps/desktop/src/main/lib/agent-setup/)
For each agent type Superset writes a config file with hooks:
```
const DESKTOP_AGENT_SETUP_RUNNERS: Record<DesktopAgentSetupAction, () => void> = {
  "claude-wrapper": createClaudeWrapper,
  "codex-wrapper": createCodexWrapper,
  "cursor-agent-wrapper": createCursorAgentWrapper,
  "gemini-wrapper": createGeminiWrapper,
  "mastra-wrapper": createMastraWrapper,
  "copilot-hook-script": createCopilotHookScript,
  // 21 total
};
```

For Claude Code, ~/.claude/settings.json is patched to add a managed hook block:

```
[ -n "$SUPERSET_HOME_DIR" ] && [ -x "$SUPERSET_HOME_DIR/hooks/notify" ] && "$SUPERSET_HOME_DIR/hooks/notify" || true
```

The block is rewritten on each launch (idempotent, marker-fenced), so a user editing their own settings.json doesn't break Superset, and vice versa.
This pattern — PATH rewrite + hook config + binary unchanged — is the clever bit of the orchestration architecture.
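The idempotent, marker-fenced rewrite is worth pausing on. A generic sketch of the idea (marker strings and function name are illustrative, not what Superset actually writes):

```typescript
// Illustrative marker-fenced upsert: rewrite only the managed region,
// leave everything the user wrote alone. Safe to run on every launch.
const BEGIN = "# superset:managed:begin";
const END = "# superset:managed:end";

function upsertManagedBlock(fileText: string, block: string): string {
  const fenced = `${BEGIN}\n${block}\n${END}`;
  const start = fileText.indexOf(BEGIN);
  const end = fileText.indexOf(END);
  if (start !== -1 && end !== -1) {
    // Replace the existing managed region in place.
    return fileText.slice(0, start) + fenced + fileText.slice(end + END.length);
  }
  // First run: append the fenced block after the user's content.
  return `${fileText.trimEnd()}\n\n${fenced}\n`;
}
```

Running it twice with different block contents converges on the latest version, which is exactly the property that makes per-launch rewriting safe.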
6. Mastracode Harness — Internal Chat Runtime
packages/chat/src/server/trpc/service.ts:161-202 shows initialization:
```
const omModel = resolveOmModelFromAuth();
const runtime = await createMastraCode({
  cwd: runtimeCwd,
  extraTools,
  disableMcp: !ENABLE_MASTRA_MCP_SERVERS, // currently false
  ...(omModel && {
    initialState: {
      observerModelId: omModel,
      reflectorModelId: omModel,
    },
  }),
});
runtime.harness.init();
runtime.harness.selectOrCreateThread();
```

Harness API:
- `sendMessage(payload)` — submit messages with optional files
- `respondToQuestion(payload)` — answer sandbox questions
- `respondToToolApproval(decision)` — approve/decline tool use
- `respondToPlanApproval(response)` — accept/reject plans
- `listMessages()` — history
- `getDisplayState()` — pending questions/approvals/errors
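A UI driving this surface essentially polls the display state and routes whatever is pending to the matching respond* call. The state shape and dispatcher below are a hypothetical sketch of that loop, following the API list above:

```typescript
// Hypothetical display-state shape and dispatcher; field names are illustrative.
type DisplayState = {
  pendingQuestion?: { id: string; text: string };
  pendingToolApproval?: { id: string; tool: string };
  error?: string;
};

function nextAction(
  state: DisplayState,
): "show-error" | "answer-question" | "approve-tool" | "idle" {
  if (state.error) return "show-error"; // errors preempt everything
  if (state.pendingQuestion) return "answer-question"; // → respondToQuestion
  if (state.pendingToolApproval) return "approve-tool"; // → respondToToolApproval
  return "idle";
}
```

The useful property: the harness never pushes approvals at the UI; the UI pulls state and decides what to surface, which keeps the approval flow interruptible.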
6.1 MCP injection from Superset
packages/chat/src/server/trpc/utils/runtime/superset-mcp.ts:1-43:
```
import { MCPClient } from "@mastra/mcp";

export async function getSupersetMcpTools(headers, apiUrl) {
  const client = new MCPClient({
    id: `superset-mcp-${Date.now()}`,
    servers: {
      superset: {
        url: new URL(`${apiUrl}/api/agent/mcp`),
        fetch: async (url, init) => {
          const merged = new Headers(init?.headers);
          for (const [k, v] of Object.entries(await headers())) merged.set(k, v);
          return fetch(url, { ...init, headers: merged });
        },
      },
    },
  });
  return (await client.listTools()) as Record<string, MastraExtraTool>;
}
```

The harness gets Superset's tools (tasks, workspaces, automations, …) and surfaces them to the LLM. Auth is forwarded via headers().
6.2 Hooks (guardrails)
packages/chat/src/server/trpc/utils/runtime/runtime.ts:135-144 shows the user-prompt-submit gate:
```
export async function onUserPromptSubmit(runtime, userMessage): Promise<void> {
  if (!runtime.hookManager) return;
  const result = await runtime.hookManager.runUserPromptSubmit(userMessage);
  if (!result.allowed) {
    throw new Error(result.blockReason ?? "Blocked by UserPromptSubmit hook");
  }
}
```

Available lifecycle hooks: SessionStart, UserPromptSubmit, Stop, SessionEnd — all manageable from .mastra/hooks/ config (or .claude/hooks/ for Claude Code-style hook reuse).
Hooks can:
- Block prompts (above)
- Inject context
- Notify external systems
- Veto tool use
The hooks layer is the main programmable guardrail. Pre-run rate limiting, content filters, secret-redaction — all live as hooks rather than hard-coded in Superset.
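As a concrete example of what such a hook might look like, here is a hypothetical UserPromptSubmit-style guard that vetoes prompts containing what looks like a credential. It is a sketch of the pattern, not shipped Superset code:

```typescript
// Hypothetical secret-redaction guard in the UserPromptSubmit shape:
// returns { allowed, blockReason } like the gate shown above.
type HookResult = { allowed: boolean; blockReason?: string };

function secretScanHook(prompt: string): HookResult {
  // Crude patterns for illustration: OpenAI-style keys and AWS access key IDs.
  const leak = /(sk-[A-Za-z0-9]{20,}|AKIA[0-9A-Z]{16})/.exec(prompt);
  if (leak) {
    return {
      allowed: false,
      blockReason: "prompt appears to contain a credential; remove it and retry",
    };
  }
  return { allowed: true };
}
```

Plugged into the gate above, a blocked result throws before the prompt ever reaches the model, which is the whole point of putting policy in hooks rather than in core.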
7. Streaming
7.1 Chat streaming — Durable Streams
apps/api/src/app/api/chat/lib.ts:
```
export const PROTOCOL_QUERY_PARAMS = ["offset", "live", "cursor"];
export const PROTOCOL_RESPONSE_HEADERS = [
  "stream-next-offset", "stream-cursor", "stream-up-to-date", "stream-closed",
  "content-type", "cache-control", "etag",
];

export function streamUrl(sessionId: string) {
  return `${env.DURABLE_STREAMS_URL}/sessions/${sessionId}`;
}

export async function appendToStream(sessionId, event) {
  const response = await fetch(streamUrl(sessionId), {
    method: "POST",
    headers: { Authorization: `Bearer ${env.DURABLE_STREAMS_SECRET}`, "Content-Type": "application/json" },
    body: event,
  });
  if (!response.ok) throw new Error(`Stream append failed: ${response.status}`);
}
```

Durable Streams provide a resumable SSE pipeline: each event has an offset, the client tracks stream-next-offset, and on reconnect it resumes from there. Tab close, network blip, even a server-side restart — the stream picks up.
apps/api/src/app/api/chat/[sessionId]/stream/route.ts:18-72 is the GET proxy: validates auth, forwards offset/live/cursor, returns 204 if up-to-date, otherwise streams.
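The client's resume step is mechanical: read the offset header the proxy forwards and build the next request. The header names come from PROTOCOL_RESPONSE_HEADERS above; the function itself is an illustrative sketch (the real client sits behind @ai-sdk/react):

```typescript
// Illustrative resume step for the durable-stream protocol.
function nextStreamRequest(
  baseUrl: string,
  headers: Record<string, string>,
): { url: string; closed: boolean } {
  // Server tells us where to resume; default to the start on first request.
  const offset = headers["stream-next-offset"] ?? "0";
  return {
    url: `${baseUrl}?offset=${encodeURIComponent(offset)}&live=true`,
    closed: headers["stream-closed"] === "true", // nothing more will ever arrive
  };
}

const next = nextStreamRequest("https://api.example/sessions/s1", {
  "stream-next-offset": "42",
});
```

Because the offset lives in a response header rather than in client memory alone, any fresh client (new tab, restarted app) can pick up a session mid-stream.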
7.2 Terminal streaming — relay tunnel
Different problem, different transport. PTY frames are bursty, bidirectional, and high-volume. They go through apps/relay/src/index.ts:66-174 — a Hono WebSocket app on Fly.io:
```
app.get("/tunnel", upgradeWebSocket((c) => {
  const hostId = c.req.query("hostId");
  return {
    onOpen: async (_e, ws) => tunnelManager.register(hostId, token, ws),
    onMessage: (event) => tunnelManager.handleMessage(hostId, event.data),
    onClose: () => tunnelManager.unregister(hostId),
  };
}));
```

The relay is not for LLM tokens; it's for terminal frames (and other host-routed traffic).
8. Guardrails
A snapshot of every place safety is enforced:
| Concern | Enforcement | Where |
|---|---|---|
| Block disallowed prompts | UserPromptSubmit hook → throws if result.allowed === false | packages/chat/src/server/trpc/utils/runtime/runtime.ts:135-144 |
| Tool approvals (sandboxing) | Mastracode respondToToolApproval flow surfaces tool calls to the user before execution | @mastra/core |
| Plan approvals | respondToPlanApproval — multi-step plans reviewed before execution | same |
| Attachment size | Max 10 attachments / 50MB each / 200MB total per message; filename sanitization; base64 validation | apps/desktop/src/renderer/lib/agent-session-orchestrator/adapters/terminal-adapter.ts:46-100 |
| Prompt length | Hard-capped per tool definition (e.g., automations: 100K chars) | Zod schemas in packages/mcp-v2/src/tools/ |
| Secret stores | secrets table (org × project) for encrypted env vars | packages/db/src/schema/schema.ts |
| Quote-injection in shell | buildPromptCommandString derives a unique heredoc delimiter; never interpolates | packages/shared/src/agent-prompt-launch.ts:26-68 |
| Lateral movement between orgs | Better Auth activeOrganizationId scoping; tRPC context restricts queries | packages/auth/src/server.ts, apps/api/src/trpc/context.ts |
| Cross-machine command auth | Relay verifies JWT before tunneling | apps/relay/src/auth.ts |
| Electric SQL data leakage | Cloudflare Worker filters tables by organizationId | apps/electric-proxy/src/index.ts |
| Rate-limiting | Not in-tree as a dedicated module — relies on Stripe billing tier + Better Auth + edge proxies | Implicit |
Note the absence of a built-in PII redactor or content-policy filter — Superset trusts the underlying agent's safety stack and adds workflow guardrails (approvals, hooks) instead of output guardrails. This is appropriate for a developer tool whose users are themselves the audience.
9. The "Skill Preload" Story (Removed Feature)
docs/skill-preload-feature.md documents an LLM feature that didn't ship:
- Idea: extract `/command` chips from the user's message and pre-attach matching skills (skill files in `.mastracode/skills/` or `.claude/skills/`) as hints to the agent.
- Implementation: required a Mastra fork to accept `preloadSkills` metadata.
- Why removed: upstream Mastra 1.26.0-alpha+ ships `search_skills` and `load_skill` as native tools — agents discover skills autonomously.
Lesson: the team chose to remove the workaround once upstream caught up rather than keep maintaining a fork.
10. The Two Most Worth-Reading Files
If you only have time for two LLM-related files:
- `packages/shared/src/agent-prompt-template.ts` — small, dense, reads like a spec. It's where the prompt contract between Superset and any agent is defined.
- `apps/desktop/src/renderer/lib/agent-session-orchestrator/agent-session-orchestrator.ts` — the dispatch logic that turns a "user wants an agent to handle this" intent into a running PTY process. It's the seam between Superset's UX and the agent ecosystem.
For wider context after those two:
- `packages/chat/src/server/trpc/service.ts` — chat runtime wiring
- `packages/host-service/src/trpc/router/settings/agent-presets.ts` — the agent matrix
- `packages/mcp-v2/src/tools/register.ts` — tool catalogue
- `packages/chat/src/server/trpc/utils/runtime/superset-mcp.ts` — MCP injection bridge
- `apps/desktop/src/main/lib/agent-setup/agent-wrappers-claude-codex-opencode.ts` — hook injection mechanics
11. Patterns Worth Stealing
- PATH-rewriting + hook injection to instrument an unmodified third-party CLI. `~/.superset/bin/shims` + per-agent hook config files. Cheap, robust, vendor-agnostic.
- Per-agent prompt templates with a per-agent dialect (XML for Claude, markdown for Codex). The same `LaunchSource[]` composes to different surface forms.
- `scope: "system" | "user"` on every context section, with cache-control hints, paving the way for prompt-caching without bespoke per-agent code.
- Bytes through IPC, encode at the provider boundary. Don't double-base64 your way through every internal hop.
- Heredoc with random-id delimiter when invoking subprocesses with arbitrary text — never `--prompt "$(echo $userText)"`.
- MCP tool emitter for audit/observability, set once, applied to every tool.
- Resumable SSE via offset+cursor for long agent streams (tab close shouldn't lose state).
- `UserPromptSubmit` hook with veto power as the user-replaceable safety layer, instead of baking policy into core.
- `auto` for base-branch-source classification at the edge; carry the resolved kind through the chain so internal code never re-classifies.
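The heredoc trick deserves one concrete illustration. A per-invocation random delimiter means user text can never close the heredoc early, so nothing in the prompt is shell-interpreted. This sketch is illustrative; the real buildPromptCommandString in agent-prompt-launch.ts may differ:

```typescript
import { randomBytes } from "node:crypto";

// Illustrative heredoc builder: the quoted delimiter ('…') disables expansion
// inside the body, and the random suffix makes early termination improbable.
function buildPromptCommand(binary: string, prompt: string): string {
  const delim = `SUPERSET_EOF_${randomBytes(8).toString("hex")}`;
  return `${binary} "$(cat <<'${delim}'\n${prompt}\n${delim}\n)"`;
}
```

Contrast with `--prompt "$(echo $userText)"`, where any `$(…)` or backtick in the user text executes; in the heredoc form it is inert data.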