CodeDocs Vault

LLM Integration

How Superset leverages Large Language Models — what models, what prompts, what tools, what guardrails, what streaming, what's clever, what's risky.

Superset has two layers of LLM usage:

  1. Orchestration layer — Superset runs third-party agent CLIs (Claude Code, Codex, Cursor Agent, Gemini CLI, …) as black boxes inside a PTY. The LLM runs inside those agents; Superset is the harness around them.
  2. Internal layer — Superset also makes LLM calls itself for in-app chat (Mastracode runtime), title generation, observer/reflector flows, and small auxiliary tasks.

Understanding both is essential — and noticing the difference is half the lesson.


1. Models In Use

| Model family | Used for | Where |
| --- | --- | --- |
| claude-haiku-4-5-20251001 | "Small model" tasks: title generation, observer/reflector models in Mastra | packages/chat/src/server/shared/small-model/get-small-model.ts:157-178 |
| gpt-4o-mini | Same — fallback when Anthropic auth is not present | same |
| Whatever the user's Claude/Codex/Cursor CLI uses | The actual coding agent — Superset doesn't pick the model, the agent does | each agent's own config (e.g., ~/.claude/) |

Auth resolution order for the small-model: env vars → mastracode storage (API key) → Claude OAuth tokens → OpenAI fallback. This is the file that contains the OAuth-via-Claude-Code path with special headers (get-small-model.ts:157-178).
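That fallback chain can be sketched as a simple first-match resolver. This is an illustrative reconstruction, not the actual code in get-small-model.ts — the option names and `AuthSource` type are assumptions:

```typescript
// Hypothetical sketch of the small-model auth fallback chain.
// Order mirrors the doc: env vars → mastracode storage → Claude OAuth → OpenAI.
type AuthSource =
  | { kind: "env"; key: string }
  | { kind: "storage"; key: string }
  | { kind: "claude-oauth"; token: string }
  | { kind: "openai"; key: string };

function resolveSmallModelAuth(opts: {
  envKey?: string;
  storedKey?: string;
  claudeOauthToken?: string;
  openaiKey?: string;
}): AuthSource | null {
  if (opts.envKey) return { kind: "env", key: opts.envKey };
  if (opts.storedKey) return { kind: "storage", key: opts.storedKey };
  if (opts.claudeOauthToken) return { kind: "claude-oauth", token: opts.claudeOauthToken };
  if (opts.openaiKey) return { kind: "openai", key: opts.openaiKey };
  return null; // no usable auth → small-model features unavailable
}
```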


2. SDKs & Frameworks

| Dependency | Role |
| --- | --- |
| @ai-sdk/anthropic 3.0.43 | Anthropic provider for the Vercel AI SDK |
| @ai-sdk/openai 3.0.36 | OpenAI provider |
| @ai-sdk/react 3.0 | UI streaming hooks (useChat, useObject, etc.) |
| ai 6.0 | Unified Vercel AI SDK |
| @mastra/core 1.26.0-alpha.3 | Mastracode harness — agent runtime with hooks, tool approvals, observer/reflector |
| @mastra/mcp | MCP client that Mastracode uses to talk to Superset's own MCP server |

The presence of both Vercel AI SDK and Mastracode is intentional: Mastra owns the agent loop (system prompts, tool-use, approvals); Vercel AI SDK is used at the edges (UI streaming via @ai-sdk/react, one-shot title generation).


3. Prompt Templates

Superset's own prompts (i.e. those it composes to the agent on the user's behalf) live in packages/shared/src/agent-prompt-template.ts.

3.1 Default terminal task prompt (agent-prompt-template.ts:62-78)

Task: "{{title}}" ({{slug}})
Priority: {{priority}}
Status: {{statusName}}
Labels: {{labels}}
 
{{description}}
 
Work in the current workspace. Inspect the relevant code, make the needed changes, verify them when practical, and update task "{{id}}" with a short summary when done.
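The {{placeholder}} substitution can be sketched with a few lines of TypeScript. This is a minimal stand-in — the actual renderer in agent-prompt-template.ts may handle escaping and missing keys differently:

```typescript
// Minimal mustache-style interpolation sketch (illustrative, not the real renderer).
function renderTemplate(template: string, vars: Record<string, string>): string {
  // Replace each {{name}} with its value; unknown names become empty strings.
  return template.replace(/\{\{(\w+)\}\}/g, (_match, name: string) => vars[name] ?? "");
}

const prompt = renderTemplate(
  'Task: "{{title}}" ({{slug}})\nPriority: {{priority}}',
  { title: "Fix login bug", slug: "fix-login", priority: "high" },
);
```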

3.2 Default chat task prompt (agent-prompt-template.ts:64+)

Task: "{{title}}" ({{slug}})
Priority: {{priority}}
Status: {{statusName}}
Labels: {{labels}}
 
{{description}}
 
Help with this task in the current workspace and take the next concrete step.

The terminal version is more directive ("make the needed changes, verify them when practical, update the task when done"). The chat version is conversational ("take the next concrete step").

3.3 Context prompt template (user-level)

{{userPrompt}}
 
{{tasks}}
 
{{issues}}
 
{{prs}}
 
{{attachments}}

Default system template is empty — the agent's own system prompt is preserved. Superset never overrides it. Each agent (Claude/Codex/Cursor) has its own per-agent template that drops in dialect-specific fences (XML for Claude, markdown for Codex/Cursor).
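The per-agent dialect difference can be sketched as a tiny fencing function. The `Dialect` type and `fenceSection` name are illustrative, not the actual template API:

```typescript
// Sketch: same context section, rendered in each agent's preferred dialect.
// XML-ish tags for Claude, markdown headings for Codex/Cursor (assumption
// based on the doc's description of per-agent templates).
type Dialect = "xml" | "markdown";

function fenceSection(dialect: Dialect, name: string, body: string): string {
  return dialect === "xml"
    ? `<${name}>\n${body}\n</${name}>`
    : `## ${name}\n\n${body}`;
}

const forClaude = fenceSection("xml", "tasks", "TASK-42: fix login");
const forCodex = fenceSection("markdown", "tasks", "TASK-42: fix login");
```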

3.4 Title generation prompt (packages/chat/src/server/desktop/title-generation/title-generation.ts:61)

instructions: params.instructions ?? "You generate concise titles."

Run by a Mastra Agent against the small model (Haiku or 4o-mini) using agent.generateTitleFromUserMessage({ message, tracingContext }).

3.5 What you won't find

There is no monolithic "system prompt for the coding agent" inside Superset itself. That's deliberate — the coding agents (Claude Code, Codex, etc.) ship with their own system prompts; Superset's role is to deliver context, not to override their personalities. The closest thing to a Superset-authored coding-agent prompt is the task template above, which composes into the agent's user message.


4. Tool Surface (MCP)

Superset exposes MCP tools at three places: the cloud API (apps/api/api/agent/[transport]), the desktop MCP (in-process, for Claude Code / Codex when run locally), and the host-service's mastracode harness (which calls back to the cloud MCP).

4.1 MCP v1 — packages/mcp/src/tools/index.ts

17 tools, organized into:

| Category | Tools |
| --- | --- |
| Devices/workspaces | listDevices, listWorkspaces, listProjects, getWorkspaceDetails, createWorkspace, switchWorkspace, deleteWorkspace, updateWorkspace, getAppContext |
| Tasks | createTask, updateTask, listTasks, getTask, deleteTask, listTaskStatuses |
| Org | listMembers |
| Agent sessions | startAgentSession |

startAgentSession is the meta tool — it lets one agent dispatch another to a different workspace.

4.2 MCP v2 — packages/mcp-v2/src/tools/register.ts

Adds automations (cron-like recurring agent runs):

| Category | Tools |
| --- | --- |
| Automations | automations_create, _list, _get, _update, _delete, _pause, _resume, _run, _logs |
| Prompt editing | automations_get_prompt, automations_set_prompt (exposes prompt editing to agents) |

Plus enhanced workspace/project/host tools.

The promotion of "edit your own automation prompt" to a first-class tool is a notable design choice — it lets a long-running agent rewrite its own future-instance instructions over time.
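The self-rewriting loop can be illustrated with an in-memory stand-in. Everything here is hypothetical — the store, the function names, and the example prompt are not Superset's actual automation service; only the get/set tool pair comes from the doc:

```typescript
// Illustrative only: a long-running agent evolving its own schedule's
// instructions via automations_get_prompt / automations_set_prompt.
const automationPrompts = new Map<string, string>();
automationPrompts.set("auto-1", "Summarize yesterday's failing CI runs.");

function automationsGetPrompt(id: string): string {
  return automationPrompts.get(id) ?? "";
}

function automationsSetPrompt(id: string, prompt: string): void {
  automationPrompts.set(id, prompt);
}

// One instance appends a lesson learned, so future instances inherit it.
const current = automationsGetPrompt("auto-1");
automationsSetPrompt("auto-1", current + "\nSkip the flaky e2e suite; it is tracked separately.");
```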

4.3 Tool definition style (packages/mcp-v2/src/tools/automations/create.ts)

defineTool(server, {
  name: "automations_create",
  description: "Schedule a recurring agent run...",
  inputSchema: {
    name: z.string().min(1).max(200),
    prompt: z.string().min(1).max(100_000),
    agentConfig: z.object({
      id: z.string().min(1),
      kind: z.enum(["terminal", "chat"]),
    }).passthrough(),
    rrule: z.string(), // RFC 5545
    // … timezone, mcpScope, etc.
  },
  handler: async (input, ctx) => caller.automation.create(input),
});

Patterns to notice in this definition:

  - Every string input carries an explicit Zod length cap (the prompt alone is limited to 100K chars).
  - agentConfig uses .passthrough(), so unknown agent-specific fields survive validation.
  - Schedules are plain RFC 5545 rrule strings rather than a bespoke scheduling format.
  - The handler is a thin delegation to caller.automation.create — the tool layer stays transport-only.

4.4 Tool registration with emitter (packages/mcp-v2/src/tools/register.ts:57-65)

export function registerTools(server: McpServer, options?: RegisterToolsOptions): void {
  setServerToolCallEmitter(server, options?.onToolCall);
  for (const mod of REGISTRARS) mod.register(server);
}

The emitter injects an audit hook — every tool call goes through it, allowing logging/usage tracking without touching individual tools.
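A minimal sketch of that wrap-once pattern, assuming a simplified `Handler` shape (the real `McpServer`/emitter types differ):

```typescript
// Sketch: wrap every tool handler with one audit callback, so individual
// tools never carry logging code. Types are simplified assumptions.
type ToolCallEvent = { tool: string; input: unknown };
type Handler = (input: unknown) => Promise<unknown>;

function withEmitter(
  tools: Record<string, Handler>,
  onToolCall: (e: ToolCallEvent) => void,
): Record<string, Handler> {
  const wrapped: Record<string, Handler> = {};
  for (const [name, handler] of Object.entries(tools)) {
    wrapped[name] = async (input) => {
      onToolCall({ tool: name, input }); // audit hook fires before execution
      return handler(input);
    };
  }
  return wrapped;
}
```

Set once at registration time, the emitter sees every call regardless of which module registered the tool.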


5. Agent Presets — Launching The Real CLIs

packages/host-service/src/trpc/router/settings/agent-presets.ts:31-126 is where the matrix of supported agents lives:

{ presetId: "claude", label: "Claude",
  command: "claude", args: ["--permission-mode", "acceptEdits"],
  promptTransport: "argv", promptArgs: [] },
 
{ presetId: "codex", label: "Codex",
  command: "codex", args: ["-c", 'model_reasoning_effort="high"', …],
  promptTransport: "argv", promptArgs: ["--"] },
 
{ presetId: "gemini", label: "Gemini",
  command: "gemini", args: ["--approval-mode=auto_edit"],
  promptTransport: "argv", promptArgs: [] },
 
{ presetId: "mastracode", label: "Mastracode",
  command: "mastracode", promptTransport: "argv", promptArgs: ["--prompt"] },
 
// + claude-yolo, codex-yolo, opencode, cursor-agent, copilot, amp, pi, …

promptTransport: either "argv" (the prompt is a CLI arg) or "stdin" (the prompt is piped). promptArgs are the flags that introduce the prompt arg.

The "yolo" variants are looser-permission profiles (e.g., --dangerously-skip-permissions) — kept as separate presets, not flag toggles, so the user has to opt in deliberately.

5.1 Wrapper hook injection (apps/desktop/src/main/lib/agent-setup/)

For each agent type Superset writes a config file with hooks:

const DESKTOP_AGENT_SETUP_RUNNERS: Record<DesktopAgentSetupAction, () => void> = {
  "claude-wrapper": createClaudeWrapper,
  "codex-wrapper": createCodexWrapper,
  "cursor-agent-wrapper": createCursorAgentWrapper,
  "gemini-wrapper": createGeminiWrapper,
  "mastra-wrapper": createMastraWrapper,
  "copilot-hook-script": createCopilotHookScript,
  // 21 total
};

For Claude Code: ~/.claude/settings.json is patched to add a managed hook block:

[ -n "$SUPERSET_HOME_DIR" ] && [ -x "$SUPERSET_HOME_DIR/hooks/notify" ] && "$SUPERSET_HOME_DIR/hooks/notify" || true

The block is rewritten on each launch (idempotent, marker-fenced) so a user's edits to their own settings.json don't break Superset, and vice versa.
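The marker-fenced upsert can be sketched like this. The marker strings are illustrative, not the ones Superset actually writes:

```typescript
// Sketch: idempotent managed-block patching — replace the fenced block if
// present, append it otherwise. Running it twice yields the same file.
const BEGIN = "# >>> superset managed >>>";
const END = "# <<< superset managed <<<";

function upsertManagedBlock(fileText: string, block: string): string {
  const fenced = `${BEGIN}\n${block}\n${END}`;
  const pattern = new RegExp(`${BEGIN}[\\s\\S]*?${END}`);
  return pattern.test(fileText)
    ? fileText.replace(pattern, fenced)       // rewrite the managed block in place
    : `${fileText.trimEnd()}\n\n${fenced}\n`; // first install: append after user content
}
```

Because only the fenced span is touched, user edits outside the markers survive every launch.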

This pattern — PATH rewrite + hook config + binary unchanged — is the clever bit of the orchestration architecture.


6. Mastracode Harness — Internal Chat Runtime

packages/chat/src/server/trpc/service.ts:161-202 shows initialization:

const omModel = resolveOmModelFromAuth();
const runtime = await createMastraCode({
  cwd: runtimeCwd,
  extraTools,
  disableMcp: !ENABLE_MASTRA_MCP_SERVERS, // currently false
  ...(omModel && {
    initialState: {
      observerModelId: omModel,
      reflectorModelId: omModel,
    },
  }),
});
runtime.harness.init();
runtime.harness.selectOrCreateThread();

Harness API (the surface exercised here and elsewhere in this doc):

  - runtime.harness.init() and selectOrCreateThread() — boot the runtime and attach a thread
  - respondToToolApproval — feed the user's per-tool-call approval decision back into the agent loop
  - respondToPlanApproval — same, for multi-step plans

6.1 MCP injection from Superset

packages/chat/src/server/trpc/utils/runtime/superset-mcp.ts:1-43:

import { MCPClient } from "@mastra/mcp";
 
export async function getSupersetMcpTools(headers, apiUrl) {
  const client = new MCPClient({
    id: `superset-mcp-${Date.now()}`,
    servers: {
      superset: {
        url: new URL(`${apiUrl}/api/agent/mcp`),
        fetch: async (url, init) => {
          const merged = new Headers(init?.headers);
          for (const [k, v] of Object.entries(await headers())) merged.set(k, v);
          return fetch(url, { ...init, headers: merged });
        },
      },
    },
  });
  return (await client.listTools()) as Record<string, MastraExtraTool>;
}

The harness gets Superset's tools (tasks, workspaces, automations, …) and surfaces them to the LLM. Auth is forwarded via headers().

6.2 Hooks (guardrails)

packages/chat/src/server/trpc/utils/runtime/runtime.ts:135-144 shows the user-prompt-submit gate:

export async function onUserPromptSubmit(runtime, userMessage): Promise<void> {
  if (!runtime.hookManager) return;
  const result = await runtime.hookManager.runUserPromptSubmit(userMessage);
  if (!result.allowed) {
    throw new Error(result.blockReason ?? "Blocked by UserPromptSubmit hook");
  }
}

Available lifecycle hooks: SessionStart, UserPromptSubmit, Stop, SessionEnd — all manageable from .mastra/hooks/ config (or .claude/hooks/ for Claude Code-style hook reuse).

Hooks can:

  - veto a user prompt (allowed: false plus a blockReason, which surfaces as the thrown error above)
  - let it pass through unchanged (allowed: true)
  - run side effects at session boundaries via SessionStart / Stop / SessionEnd

The hooks layer is the main programmable guardrail. Pre-run rate limiting, content filters, secret-redaction — all live as hooks rather than hard-coded in Superset.
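A secret-blocking hook of that kind might look like this. The { allowed, blockReason } shape follows runtime.ts above; the detector regex is a deliberately rough illustration:

```typescript
// Sketch: a UserPromptSubmit-style hook that vetoes prompts containing
// likely API credentials. Real filters would be far more careful.
type HookResult = { allowed: boolean; blockReason?: string };

function userPromptSubmit(prompt: string): HookResult {
  // Rough heuristics: OpenAI-style "sk-…" keys and AWS access key IDs.
  const secretLike = /(sk-[A-Za-z0-9]{20,}|AKIA[0-9A-Z]{16})/;
  if (secretLike.test(prompt)) {
    return { allowed: false, blockReason: "Prompt appears to contain an API credential" };
  }
  return { allowed: true };
}
```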


7. Streaming

7.1 Chat streaming — Durable Streams

apps/api/src/app/api/chat/lib.ts:

export const PROTOCOL_QUERY_PARAMS = ["offset", "live", "cursor"];
export const PROTOCOL_RESPONSE_HEADERS = [
  "stream-next-offset", "stream-cursor", "stream-up-to-date", "stream-closed",
  "content-type", "cache-control", "etag",
];
 
export function streamUrl(sessionId: string) {
  return `${env.DURABLE_STREAMS_URL}/sessions/${sessionId}`;
}
 
export async function appendToStream(sessionId, event) {
  const response = await fetch(streamUrl(sessionId), {
    method: "POST",
    headers: { Authorization: `Bearer ${env.DURABLE_STREAMS_SECRET}`, "Content-Type": "application/json" },
    body: event,
  });
  if (!response.ok) throw new Error(`Stream append failed: ${response.status}`);
}

Durable Streams provide a resumable SSE pipeline: each event has an offset, the client tracks stream-next-offset, on reconnect it resumes from there. Tab close, network blip, even server-side restart — the stream picks up.
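The client side of the resume protocol can be sketched as pure bookkeeping around those headers and query params. The names follow lib.ts above; the helper functions themselves are illustrative:

```typescript
// Sketch: track stream-next-offset / stream-cursor from each response and
// replay them as query params on the next (re)connect.
type StreamState = { offset: string | null; cursor: string | null };

function nextRequestUrl(base: string, state: StreamState): string {
  const url = new URL(base);
  if (state.offset) url.searchParams.set("offset", state.offset);
  if (state.cursor) url.searchParams.set("cursor", state.cursor);
  url.searchParams.set("live", "true"); // long-poll / SSE mode
  return url.toString();
}

function updateState(state: StreamState, headers: Map<string, string>): StreamState {
  return {
    offset: headers.get("stream-next-offset") ?? state.offset,
    cursor: headers.get("stream-cursor") ?? state.cursor,
  };
}
```

On reconnect after any interruption, the client simply issues `nextRequestUrl` with its last persisted state and receives only the events it has not yet seen.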

apps/api/src/app/api/chat/[sessionId]/stream/route.ts:18-72 is the GET proxy: validates auth, forwards offset/live/cursor, returns 204 if up-to-date, otherwise streams.

7.2 Terminal streaming — relay tunnel

Different problem, different transport. PTY frames are bursty, bidirectional, and high-volume. They go through apps/relay/src/index.ts:66-174 — a Hono WebSocket app on Fly.io:

app.get("/tunnel", upgradeWebSocket((c) => {
  const hostId = c.req.query("hostId");
  return {
    onOpen:    async (_e, ws) => tunnelManager.register(hostId, token, ws),
    onMessage: (event)        => tunnelManager.handleMessage(hostId, event.data),
    onClose:   ()             => tunnelManager.unregister(hostId),
  };
}));

The relay is not for LLM tokens; it's for terminal frames (and other host-routed traffic).


8. Guardrails

A snapshot of every place safety is enforced:

| Concern | Enforcement | Where |
| --- | --- | --- |
| Block disallowed prompts | UserPromptSubmit hook → throws if result.allowed === false | packages/chat/src/server/trpc/utils/runtime/runtime.ts:135-144 |
| Tool approvals (sandboxing) | Mastracode respondToToolApproval flow surfaces tool calls to the user before execution | @mastra/core |
| Plan approvals | respondToPlanApproval — multi-step plans reviewed before execution | same |
| Attachment size | Max 10 attachments / 50MB each / 200MB total per message; filename sanitization; base64 validation | apps/desktop/src/renderer/lib/agent-session-orchestrator/adapters/terminal-adapter.ts:46-100 |
| Prompt length | Hard-capped per tool definition (e.g., automations: 100K chars) | Zod schemas in packages/mcp-v2/src/tools/ |
| Secret stores | secrets table (org × project) for encrypted env vars | packages/db/src/schema/schema.ts |
| Quote injection in shell | buildPromptCommandString derives a unique heredoc delimiter; never interpolates | packages/shared/src/agent-prompt-launch.ts:26-68 |
| Lateral movement between orgs | Better Auth activeOrganizationId scoping; tRPC context restricts queries | packages/auth/src/server.ts, apps/api/src/trpc/context.ts |
| Cross-machine command auth | Relay verifies JWT before tunneling | apps/relay/src/auth.ts |
| Electric SQL data leakage | Cloudflare Worker filters tables by organizationId | apps/electric-proxy/src/index.ts |
| Rate limiting | Not in-tree as a dedicated module — relies on Stripe billing tier + Better Auth + edge proxies | Implicit |

Note the absence of a built-in PII redactor or content-policy filter — Superset trusts the underlying agent's safety stack and adds workflow guardrails (approvals, hooks) instead of output guardrails. This is appropriate for a developer tool whose users are themselves the audience.
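The quote-injection defense in the table above can be sketched as follows. This is an illustrative reconstruction of the idea behind buildPromptCommandString, not the actual implementation:

```typescript
// Sketch: pass arbitrary prompt text to a subprocess via a heredoc with a
// random delimiter, so the shell never interprets the prompt's contents.
import { randomBytes } from "node:crypto";

function buildPromptCommand(command: string, prompt: string): string {
  let delim = `SUPERSET_EOF_${randomBytes(8).toString("hex")}`;
  // Regenerate on the (astronomically unlikely) collision with the payload.
  while (prompt.includes(delim)) {
    delim = `SUPERSET_EOF_${randomBytes(8).toString("hex")}`;
  }
  // Quoting the delimiter ('…') disables $-expansion inside the heredoc body.
  return `${command} <<'${delim}'\n${prompt}\n${delim}`;
}
```

Compare the unsafe alternative, `--prompt "$(echo $userText)"`, where quotes, backticks, and `$(…)` in the prompt become shell syntax.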


9. The "Skill Preload" Story (Removed Feature)

docs/skill-preload-feature.md documents an LLM feature that didn't ship: a workaround for preloading agent skills that upstream agent support later made unnecessary.

Lesson: the team chose to remove the workaround once upstream caught up rather than keep maintaining a fork.


10. The Two Most Worth-Reading Files

If you only have time for two LLM-related files:

  1. packages/shared/src/agent-prompt-template.ts — small, dense, reads like a spec. It's where the prompt contract between Superset and any agent is defined.
  2. apps/desktop/src/renderer/lib/agent-session-orchestrator/agent-session-orchestrator.ts — the dispatch logic that turns a "user wants an agent to handle this" intent into a running PTY process. It's the seam between Superset's UX and the agent ecosystem.

For wider context after those two, revisit the MCP tool registries (section 4) and the agent-presets matrix (section 5).


11. Patterns Worth Stealing

  1. PATH-rewriting + hook injection to instrument an unmodified third-party CLI. ~/.superset/bin/ shims + per-agent hook config files. Cheap, robust, vendor-agnostic.
  2. Per-agent prompt templates with a per-agent dialect (XML for Claude, markdown for Codex). The same LaunchSource[] composes to different surface forms.
  3. scope: "system" | "user" on every context section, with cache-control hints, paving the way for prompt-caching without bespoke per-agent code.
  4. Bytes through IPC, encode at the provider boundary. Don't double-base64 your way through every internal hop.
  5. Heredoc with random-id delimiter when invoking subprocesses with arbitrary text — never --prompt "$(echo $userText)".
  6. MCP tool emitter for audit/observability, set once, applied to every tool.
  7. Resumable SSE via offset+cursor for long agent streams (tab close shouldn't lose state).
  8. UserPromptSubmit hook with veto power as the user-replaceable safety layer, instead of baking policy into core.
  9. auto for base-branch-source classification at the edge; carry the resolved kind through the chain so internal code never re-classifies.