LLM Integration
How Superset leverages Large Language Models — what models, what prompts, what tools, what guardrails, what streaming, what's clever, what's risky.
Superset has two layers of LLM usage:
- Orchestration layer — Superset runs third-party agent CLIs (Claude Code, Codex, Cursor Agent, Gemini CLI, …) as black boxes inside a PTY. The LLM runs inside those agents; Superset is the harness around them.
- Internal layer — Superset also makes LLM calls itself for in-app chat (Mastracode runtime), title generation, observer/reflector flows, and small auxiliary tasks.
Understanding both is essential — and noticing the difference is half the lesson.
1. Models In Use
| Model family | Used for | Where |
|---|---|---|
| claude-haiku-4-5-20251001 | "Small model" tasks: title generation, observer/reflector models in Mastra | packages/chat/src/server/shared/small-model/get-small-model.ts:157-178 |
| gpt-4o-mini | Same — fallback when Anthropic auth is not present | same |
| Whatever the user's Claude/Codex/Cursor CLI uses | The actual coding agent — Superset doesn't pick the model; the agent does | each agent's own config (e.g., ~/.claude/) |
Auth resolution order for the small-model: env vars → mastracode storage (API key) → Claude OAuth tokens → OpenAI fallback. This is the file that contains the OAuth-via-Claude-Code path with special headers (get-small-model.ts:157-178).
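That resolution order can be expressed as a simple first-hit-wins probe chain. The sketch below is illustrative — the type and helper names are hypothetical, not the real code in get-small-model.ts:

```typescript
// Hypothetical sketch of the auth fallback chain; names are illustrative.
type AuthSource = { kind: "anthropic-key" | "anthropic-oauth" | "openai-key"; token: string };
type AuthProbe = () => AuthSource | undefined;

function resolveSmallModelAuth(probes: AuthProbe[]): AuthSource | undefined {
  for (const probe of probes) {
    const found = probe();
    if (found) return found; // first hit wins — array order encodes priority
  }
  return undefined;
}

// Priority order: env vars → mastracode storage → Claude OAuth → OpenAI fallback
const auth = resolveSmallModelAuth([
  () => undefined, // env vars (lookup elided)
  () => undefined, // mastracode storage API key (elided)
  () => undefined, // Claude OAuth tokens (elided)
  () => ({ kind: "openai-key", token: "sk-…" }), // OpenAI fallback
]);
```

The point of the shape: adding a new credential source is one probe in the array, and priority is visible at a glance.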
2. SDKs & Frameworks
| Dependency | Role |
|---|---|
| @ai-sdk/anthropic 3.0.43 | Anthropic provider for Vercel AI SDK |
| @ai-sdk/openai 3.0.36 | OpenAI provider |
| @ai-sdk/react 3.0 | UI streaming hooks (useChat, useObject, etc.) |
| ai 6.0 | Unified Vercel AI SDK |
| @mastra/core 1.26.0-alpha.3 | Mastracode harness — agent runtime with hooks, tool approvals, observer/reflector |
| @mastra/mcp | MCP client that Mastracode uses to talk to Superset's own MCP server |
The presence of both Vercel AI SDK and Mastracode is intentional: Mastra owns the agent loop (system prompts, tool-use, approvals); Vercel AI SDK is used at the edges (UI streaming via @ai-sdk/react, one-shot title generation).
3. Prompt Templates
Superset's own prompts (i.e. the ones it composes on the user's behalf and sends to the agent) live in packages/shared/src/agent-prompt-template.ts.
3.1 Default terminal task prompt (agent-prompt-template.ts:62-78)
```
Task: "{{title}}" ({{slug}})
Priority: {{priority}}
Status: {{statusName}}
Labels: {{labels}}
{{description}}
Work in the current workspace. Inspect the relevant code, make the needed changes, verify them when practical, and update task "{{id}}" with a short summary when done.
```

3.2 Default chat task prompt (agent-prompt-template.ts:64+)
```
Task: "{{title}}" ({{slug}})
Priority: {{priority}}
Status: {{statusName}}
Labels: {{labels}}
{{description}}
Help with this task in the current workspace and take the next concrete step.
```

The terminal version is more directive ("make the needed changes, verify them when practical, update the task when done"). The chat version is conversational ("take the next concrete step").
3.3 Context prompt template (user-level)
```
{{userPrompt}}
{{tasks}}
{{issues}}
{{prs}}
{{attachments}}
```

The default system template is empty — the agent's own system prompt is preserved; Superset never overrides it. Each agent (Claude/Codex/Cursor) has its own per-agent template that drops in dialect-specific fences (XML for Claude, markdown for Codex/Cursor).
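The {{placeholder}} substitution itself is the simplest part of the contract. As a minimal illustrative reimplementation (not the actual code in agent-prompt-template.ts), a fill step could look like:

```typescript
// Minimal {{placeholder}} fill; unknown keys collapse to "" so optional
// sections (labels, attachments, …) can simply be absent.
function fillTemplate(tpl: string, vars: Record<string, string>): string {
  return tpl.replace(/\{\{(\w+)\}\}/g, (_match, key: string) => vars[key] ?? "");
}

const rendered = fillTemplate('Task: "{{title}}" ({{slug}})', {
  title: "Fix login",
  slug: "fix-login",
});
```

The interesting design work is upstream of this: which sections exist, which scope they carry, and which dialect fences each agent gets.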
3.4 Title generation prompt (packages/chat/src/server/desktop/title-generation/title-generation.ts:61)
```
instructions: params.instructions ?? "You generate concise titles."
```

Run by a Mastra Agent against the small model (Haiku or 4o-mini) using agent.generateTitleFromUserMessage({ message, tracingContext }).
3.5 What you won't find
There is no monolithic "system prompt for the coding agent" inside Superset itself. That's deliberate — the coding agents (Claude Code, Codex, etc.) ship with their own system prompts; Superset's role is to deliver context, not to override their personalities. The closest thing to a Superset-authored coding-agent prompt is the task template above, which composes into the agent's user message.
4. Tool Surface (MCP)
Superset exposes MCP tools at three places: the cloud API (apps/api/api/agent/[transport]), the desktop MCP (in-process, for Claude Code / Codex when run locally), and the host-service's mastracode harness (which calls back to the cloud MCP).
4.1 MCP v1 — packages/mcp/src/tools/index.ts
17 tools, organized into:
| Category | Tools |
|---|---|
| Devices/workspaces | listDevices, listWorkspaces, listProjects, getWorkspaceDetails, createWorkspace, switchWorkspace, deleteWorkspace, updateWorkspace, getAppContext |
| Tasks | createTask, updateTask, listTasks, getTask, deleteTask, listTaskStatuses |
| Org | listMembers |
| Agent sessions | startAgentSession |
startAgentSession is the meta tool — it lets one agent dispatch another to a different workspace.
4.2 MCP v2 — packages/mcp-v2/src/tools/register.ts
Adds automations (cron-like recurring agent runs):
| Category | Tools |
|---|---|
| Automations | automations_create, _list, _get, _update, _delete, _pause, _resume, _run, _logs |
| Automation prompts | automations_get_prompt, automations_set_prompt — exposes prompt editing to agents |
Plus enhanced workspace/project/host tools.
The promotion of "edit your own automation prompt" to a first-class tool is a notable design choice — it lets a long-running agent rewrite its own future-instance instructions over time.
4.3 Tool definition style (packages/mcp-v2/src/tools/automations/create.ts)
```
defineTool(server, {
  name: "automations_create",
  description: "Schedule a recurring agent run...",
  inputSchema: {
    name: z.string().min(1).max(200),
    prompt: z.string().min(1).max(100_000),
    agentConfig: z.object({
      id: z.string().min(1),
      kind: z.enum(["terminal", "chat"]),
    }).passthrough(),
    rrule: z.string(), // RFC 5545
    // … timezone, mcpScope, etc.
  },
  handler: async (input, ctx) => caller.automation.create(input),
});
```

Patterns to notice:
- Zod schemas double as JSON Schema (via zod-to-json-schema) for the MCP advert.
- Hard caps on prompt length (100K chars), name length (200), and so on.
- The handler is a thin shim over a tRPC caller — same business logic for tRPC and MCP.
4.4 Tool registration with emitter (packages/mcp-v2/src/tools/register.ts:57-65)
```
export function registerTools(server: McpServer, options?: RegisterToolsOptions): void {
  setServerToolCallEmitter(server, options?.onToolCall);
  for (const mod of REGISTRARS) mod.register(server);
}
```

The emitter injects an audit hook — every tool call goes through it, allowing logging/usage tracking without touching individual tools.
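The underlying idea — wrap every handler once at registration time so auditing never touches individual tools — can be sketched in a few lines. This is an illustrative reimplementation, not Superset's actual emitter code:

```typescript
// Hypothetical sketch of the "emitter wraps every tool" pattern.
type ToolHandler = (input: unknown) => Promise<unknown>;
type ToolCallEvent = { tool: string };

function withEmitter(
  name: string,
  handler: ToolHandler,
  emit?: (event: ToolCallEvent) => void,
): ToolHandler {
  return async (input) => {
    emit?.({ tool: name }); // audit hook fires before every call, for every tool
    return handler(input);
  };
}

const calls: string[] = [];
const create = withEmitter(
  "automations_create",
  async (input) => input, // stand-in for the real handler
  (event) => calls.push(event.tool),
);
```

Because the wrapping happens in one place (registration), individual tool modules stay free of logging concerns.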
5. Agent Presets — Launching The Real CLIs
packages/host-service/src/trpc/router/settings/agent-presets.ts:31-126 is where the matrix of supported agents lives:
```
{ presetId: "claude", label: "Claude",
  command: "claude", args: ["--permission-mode", "acceptEdits"],
  promptTransport: "argv", promptArgs: [] },
{ presetId: "codex", label: "Codex",
  command: "codex", args: ["-c", 'model_reasoning_effort="high"', …],
  promptTransport: "argv", promptArgs: ["--"] },
{ presetId: "gemini", label: "Gemini",
  command: "gemini", args: ["--approval-mode=auto_edit"],
  promptTransport: "argv", promptArgs: [] },
{ presetId: "mastracode", label: "Mastracode",
  command: "mastracode", promptTransport: "argv", promptArgs: ["--prompt"] },
// + claude-yolo, codex-yolo, opencode, cursor-agent, copilot, amp, pi, …
```

promptTransport is either "argv" (the prompt is a CLI arg) or "stdin" (the prompt is piped). promptArgs are the flags that introduce the prompt arg.
The "yolo" variants are looser-permission profiles (e.g., --dangerously-skip-permissions) — kept as separate presets, not flag toggles, so the user has to opt in deliberately.
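How a preset plus a prompt becomes a concrete launch can be sketched as below. The field names mirror the preset table; the function itself is an illustrative assumption, not the real launcher:

```typescript
// Illustrative sketch: turn a preset + prompt into a spawn description.
type AgentPreset = {
  command: string;
  args: string[];
  promptTransport: "argv" | "stdin";
  promptArgs: string[];
};

function buildLaunch(
  preset: AgentPreset,
  prompt: string,
): { command: string; args: string[]; stdin?: string } {
  if (preset.promptTransport === "argv") {
    // promptArgs introduce the prompt (e.g. Codex's "--", mastracode's "--prompt")
    return { command: preset.command, args: [...preset.args, ...preset.promptArgs, prompt] };
  }
  // stdin transport: args untouched, prompt piped to the process
  return { command: preset.command, args: preset.args, stdin: prompt };
}

const launch = buildLaunch(
  { command: "codex", args: ["-c", "x"], promptTransport: "argv", promptArgs: ["--"] },
  "fix the failing test",
);
```

Keeping transport as data rather than per-agent branches is what lets new CLIs slot in as one more preset row.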
5.1 Wrapper hook injection (apps/desktop/src/main/lib/agent-setup/)
For each agent type Superset writes a config file with hooks:
```
const DESKTOP_AGENT_SETUP_RUNNERS: Record<DesktopAgentSetupAction, () => void> = {
  "claude-wrapper": createClaudeWrapper,
  "codex-wrapper": createCodexWrapper,
  "cursor-agent-wrapper": createCursorAgentWrapper,
  "gemini-wrapper": createGeminiWrapper,
  "mastra-wrapper": createMastraWrapper,
  "copilot-hook-script": createCopilotHookScript,
  // 21 total
};
```

For Claude Code, ~/.claude/settings.json is patched to add a managed hook block:

```
[ -n "$SUPERSET_HOME_DIR" ] && [ -x "$SUPERSET_HOME_DIR/hooks/notify" ] && "$SUPERSET_HOME_DIR/hooks/notify" || true
```

The block is rewritten on each launch (idempotent, marker-fenced), so a user editing their own settings.json doesn't break Superset, and vice versa.
This pattern — PATH rewrite + hook config + binary unchanged — is the clever bit of the orchestration architecture.
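The idempotent, marker-fenced rewrite is worth pausing on. A generic sketch of the idea (marker strings and function name are illustrative, not what Superset actually writes):

```typescript
// Illustrative marker-fenced upsert: rewrite only the managed region,
// leave everything the user wrote alone. Safe to run on every launch.
const BEGIN = "# superset:managed:begin";
const END = "# superset:managed:end";

function upsertManagedBlock(fileText: string, block: string): string {
  const fenced = `${BEGIN}\n${block}\n${END}`;
  const start = fileText.indexOf(BEGIN);
  const end = fileText.indexOf(END);
  if (start !== -1 && end !== -1) {
    // Replace the existing managed region in place.
    return fileText.slice(0, start) + fenced + fileText.slice(end + END.length);
  }
  // First run: append the fenced block after the user's content.
  return `${fileText.trimEnd()}\n\n${fenced}\n`;
}
```

Running it twice with different block contents converges on the latest version, which is exactly the property that makes per-launch rewriting safe.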
6. Mastracode Harness — Internal Chat Runtime
packages/chat/src/server/trpc/service.ts:161-202 shows initialization:
```
const omModel = resolveOmModelFromAuth();
const runtime = await createMastraCode({
  cwd: runtimeCwd,
  extraTools,
  disableMcp: !ENABLE_MASTRA_MCP_SERVERS, // currently false
  ...(omModel && {
    initialState: {
      observerModelId: omModel,
      reflectorModelId: omModel,
    },
  }),
});
runtime.harness.init();
runtime.harness.selectOrCreateThread();
```

Harness API:
- `sendMessage(payload)` — submit messages with optional files
- `respondToQuestion(payload)` — answer sandbox questions
- `respondToToolApproval(decision)` — approve/decline tool use
- `respondToPlanApproval(response)` — accept/reject plans
- `listMessages()` — history
- `getDisplayState()` — pending questions/approvals/errors
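A UI driving this surface essentially polls the display state and routes whatever is pending to the matching respond* call. The state shape and dispatcher below are a hypothetical sketch of that loop, following the API list above:

```typescript
// Hypothetical display-state shape and dispatcher; field names are illustrative.
type DisplayState = {
  pendingQuestion?: { id: string; text: string };
  pendingToolApproval?: { id: string; tool: string };
  error?: string;
};

function nextAction(
  state: DisplayState,
): "show-error" | "answer-question" | "approve-tool" | "idle" {
  if (state.error) return "show-error"; // errors preempt everything
  if (state.pendingQuestion) return "answer-question"; // → respondToQuestion
  if (state.pendingToolApproval) return "approve-tool"; // → respondToToolApproval
  return "idle";
}
```

The useful property: the harness never pushes approvals at the UI; the UI pulls state and decides what to surface, which keeps the approval flow interruptible.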
6.1 MCP injection from Superset
packages/chat/src/server/trpc/utils/runtime/superset-mcp.ts:1-43:
```
import { MCPClient } from "@mastra/mcp";

export async function getSupersetMcpTools(headers, apiUrl) {
  const client = new MCPClient({
    id: `superset-mcp-${Date.now()}`,
    servers: {
      superset: {
        url: new URL(`${apiUrl}/api/agent/mcp`),
        fetch: async (url, init) => {
          const merged = new Headers(init?.headers);
          for (const [k, v] of Object.entries(await headers())) merged.set(k, v);
          return fetch(url, { ...init, headers: merged });
        },
      },
    },
  });
  return (await client.listTools()) as Record<string, MastraExtraTool>;
}
```

The harness gets Superset's tools (tasks, workspaces, automations, …) and surfaces them to the LLM. Auth is forwarded via headers().
6.2 Hooks (guardrails)
packages/chat/src/server/trpc/utils/runtime/runtime.ts:135-144 shows the user-prompt-submit gate:
```
export async function onUserPromptSubmit(runtime, userMessage): Promise<void> {
  if (!runtime.hookManager) return;
  const result = await runtime.hookManager.runUserPromptSubmit(userMessage);
  if (!result.allowed) {
    throw new Error(result.blockReason ?? "Blocked by UserPromptSubmit hook");
  }
}
```

Available lifecycle hooks: SessionStart, UserPromptSubmit, Stop, SessionEnd — all manageable from .mastra/hooks/ config (or .claude/hooks/ for Claude Code-style hook reuse).
Hooks can:
- Block prompts (above)
- Inject context
- Notify external systems
- Veto tool use
The hooks layer is the main programmable guardrail. Pre-run rate limiting, content filters, secret-redaction — all live as hooks rather than hard-coded in Superset.
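As a concrete example of what such a hook might look like, here is a hypothetical UserPromptSubmit-style guard that vetoes prompts containing what looks like a credential. It is a sketch of the pattern, not shipped Superset code:

```typescript
// Hypothetical secret-redaction guard in the UserPromptSubmit shape:
// returns { allowed, blockReason } like the gate shown above.
type HookResult = { allowed: boolean; blockReason?: string };

function secretScanHook(prompt: string): HookResult {
  // Crude patterns for illustration: OpenAI-style keys and AWS access key IDs.
  const leak = /(sk-[A-Za-z0-9]{20,}|AKIA[0-9A-Z]{16})/.exec(prompt);
  if (leak) {
    return {
      allowed: false,
      blockReason: "prompt appears to contain a credential; remove it and retry",
    };
  }
  return { allowed: true };
}
```

Plugged into the gate above, a blocked result throws before the prompt ever reaches the model, which is the whole point of putting policy in hooks rather than in core.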
7. Streaming
7.1 Chat streaming — Durable Streams
apps/api/src/app/api/chat/lib.ts:
```
export const PROTOCOL_QUERY_PARAMS = ["offset", "live", "cursor"];
export const PROTOCOL_RESPONSE_HEADERS = [
  "stream-next-offset", "stream-cursor", "stream-up-to-date", "stream-closed",
  "content-type", "cache-control", "etag",
];

export function streamUrl(sessionId: string) {
  return `${env.DURABLE_STREAMS_URL}/sessions/${sessionId}`;
}

export async function appendToStream(sessionId, event) {
  const response = await fetch(streamUrl(sessionId), {
    method: "POST",
    headers: { Authorization: `Bearer ${env.DURABLE_STREAMS_SECRET}`, "Content-Type": "application/json" },
    body: event,
  });
  if (!response.ok) throw new Error(`Stream append failed: ${response.status}`);
}
```

Durable Streams provide a resumable SSE pipeline: each event has an offset, the client tracks stream-next-offset, and on reconnect it resumes from there. Tab close, network blip, even a server-side restart — the stream picks up.
apps/api/src/app/api/chat/[sessionId]/stream/route.ts:18-72 is the GET proxy: validates auth, forwards offset/live/cursor, returns 204 if up-to-date, otherwise streams.
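The client's resume step is mechanical: read the offset header the proxy forwards and build the next request. The header names come from PROTOCOL_RESPONSE_HEADERS above; the function itself is an illustrative sketch (the real client sits behind @ai-sdk/react):

```typescript
// Illustrative resume step for the durable-stream protocol.
function nextStreamRequest(
  baseUrl: string,
  headers: Record<string, string>,
): { url: string; closed: boolean } {
  // Server tells us where to resume; default to the start on first request.
  const offset = headers["stream-next-offset"] ?? "0";
  return {
    url: `${baseUrl}?offset=${encodeURIComponent(offset)}&live=true`,
    closed: headers["stream-closed"] === "true", // nothing more will ever arrive
  };
}

const next = nextStreamRequest("https://api.example/sessions/s1", {
  "stream-next-offset": "42",
});
```

Because the offset lives in a response header rather than in client memory alone, any fresh client (new tab, restarted app) can pick up a session mid-stream.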
7.2 Terminal streaming — relay tunnel
Different problem, different transport. PTY frames are bursty, bidirectional, and high-volume. They go through apps/relay/src/index.ts:66-174 — a Hono WebSocket app on Fly.io:
```
app.get("/tunnel", upgradeWebSocket((c) => {
  const hostId = c.req.query("hostId");
  return {
    onOpen: async (_e, ws) => tunnelManager.register(hostId, token, ws),
    onMessage: (event) => tunnelManager.handleMessage(hostId, event.data),
    onClose: () => tunnelManager.unregister(hostId),
  };
}));
```

The relay is not for LLM tokens; it's for terminal frames (and other host-routed traffic).
8. Guardrails
A snapshot of every place safety is enforced:
| Concern | Enforcement | Where |
|---|---|---|
| Block disallowed prompts | UserPromptSubmit hook → throws if result.allowed === false | packages/chat/src/server/trpc/utils/runtime/runtime.ts:135-144 |
| Tool approvals (sandboxing) | Mastracode respondToToolApproval flow surfaces tool calls to the user before execution | @mastra/core |
| Plan approvals | respondToPlanApproval — multi-step plans reviewed before execution | same |
| Attachment size | Max 10 attachments / 50MB each / 200MB total per message; filename sanitization; base64 validation | apps/desktop/src/renderer/lib/agent-session-orchestrator/adapters/terminal-adapter.ts:46-100 |
| Prompt length | Hard-capped per tool definition (e.g., automations: 100K chars) | Zod schemas in packages/mcp-v2/src/tools/ |
| Secret stores | secrets table (org × project) for encrypted env vars | packages/db/src/schema/schema.ts |
| Quote-injection in shell | buildPromptCommandString derives a unique heredoc delimiter; never interpolates | packages/shared/src/agent-prompt-launch.ts:26-68 |
| Lateral movement between orgs | Better Auth activeOrganizationId scoping; tRPC context restricts queries | packages/auth/src/server.ts, apps/api/src/trpc/context.ts |
| Cross-machine command auth | Relay verifies JWT before tunneling | apps/relay/src/auth.ts |
| Electric SQL data leakage | Cloudflare Worker filters tables by organizationId | apps/electric-proxy/src/index.ts |
| Rate-limiting | Not in-tree as a dedicated module — relies on Stripe billing tier + Better Auth + edge proxies | Implicit |
Note the absence of a built-in PII redactor or content-policy filter — Superset trusts the underlying agent's safety stack and adds workflow guardrails (approvals, hooks) instead of output guardrails. This is appropriate for a developer tool whose users are themselves the audience.
9. The "Skill Preload" Story (Removed Feature)
docs/skill-preload-feature.md documents an LLM feature that didn't ship:
- Idea: extract `/command` chips from the user's message and pre-attach matching skills (skill files in `.mastracode/skills/` or `.claude/skills/`) as hints to the agent.
- Implementation: required a Mastra fork to accept `preloadSkills` metadata.
- Why removed: upstream Mastra 1.26.0-alpha+ ships `search_skills` and `load_skill` as native tools — agents discover skills autonomously.
Lesson: the team chose to remove the workaround once upstream caught up rather than keep maintaining a fork.
10. The Two Most Worth-Reading Files
If you only have time for two LLM-related files:
- `packages/shared/src/agent-prompt-template.ts` — small, dense, reads like a spec. It's where the prompt contract between Superset and any agent is defined.
- `apps/desktop/src/renderer/lib/agent-session-orchestrator/agent-session-orchestrator.ts` — the dispatch logic that turns a "user wants an agent to handle this" intent into a running PTY process. It's the seam between Superset's UX and the agent ecosystem.
For wider context after those two:
- `packages/chat/src/server/trpc/service.ts` — chat runtime wiring
- `packages/host-service/src/trpc/router/settings/agent-presets.ts` — the agent matrix
- `packages/mcp-v2/src/tools/register.ts` — tool catalogue
- `packages/chat/src/server/trpc/utils/runtime/superset-mcp.ts` — MCP injection bridge
- `apps/desktop/src/main/lib/agent-setup/agent-wrappers-claude-codex-opencode.ts` — hook injection mechanics
11. Patterns Worth Stealing
- PATH-rewriting + hook injection to instrument an unmodified third-party CLI. `~/.superset/bin/shims` + per-agent hook config files. Cheap, robust, vendor-agnostic.
- Per-agent prompt templates with a per-agent dialect (XML for Claude, markdown for Codex). The same `LaunchSource[]` composes to different surface forms.
- `scope: "system" | "user"` on every context section, with cache-control hints, paving the way for prompt-caching without bespoke per-agent code.
- Bytes through IPC, encode at the provider boundary. Don't double-base64 your way through every internal hop.
- Heredoc with random-id delimiter when invoking subprocesses with arbitrary text — never `--prompt "$(echo $userText)"`.
- MCP tool emitter for audit/observability, set once, applied to every tool.
- Resumable SSE via offset+cursor for long agent streams (tab close shouldn't lose state).
- `UserPromptSubmit` hook with veto power as the user-replaceable safety layer, instead of baking policy into core.
- `auto` for base-branch-source classification at the edge; carry the resolved kind through the chain so internal code never re-classifies.
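The heredoc trick deserves one concrete illustration. A per-invocation random delimiter means user text can never close the heredoc early, so nothing in the prompt is shell-interpreted. This sketch is illustrative; the real buildPromptCommandString in agent-prompt-launch.ts may differ:

```typescript
import { randomBytes } from "node:crypto";

// Illustrative heredoc builder: the quoted delimiter ('…') disables expansion
// inside the body, and the random suffix makes early termination improbable.
function buildPromptCommand(binary: string, prompt: string): string {
  const delim = `SUPERSET_EOF_${randomBytes(8).toString("hex")}`;
  return `${binary} "$(cat <<'${delim}'\n${prompt}\n${delim}\n)"`;
}
```

Contrast with `--prompt "$(echo $userText)"`, where any `$(…)` or backtick in the user text executes; in the heredoc form it is inert data.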