# Multica Repository Analysis: LLM Integration & Agent System

## How LLMs Are Used
Multica does not call LLM APIs directly. Instead, it orchestrates local coding agent CLIs (Claude Code, Codex, Copilot, etc.) as subprocess workers. The LLM interaction happens inside the agent CLIs; Multica manages the lifecycle around them.
```
Multica Server             Daemon               Agent CLI          LLM API
       |                      |                     |                  |
       |--create task-------->|                     |                  |
       |                      |--spawn process----->|                  |
       |                      |                     |--API call------->|
       |                      |                     |<--response-------|
       |                      |<--stream-json events|                  |
       |<--progress/messages--|                     |                  |
       |                      |<--result------------|                  |
       |<--complete/usage-----|                     |                  |
```
This design makes Multica agent-agnostic: it needs no API keys for any LLM provider. The agents use whatever credentials are already configured on the user's machine.
## Supported Agent Providers

Defined in `server/pkg/agent/agent.go:97-126`:
| Provider | CLI | Protocol | Model Discovery |
|---|---|---|---|
| Claude | `claude` | stream-json (stdin/stdout) | Static catalog |
| Codex | `codex` | app-server (HTTP) | Static catalog |
| Copilot | `copilot` | JSON (stdin/stdout) | Static catalog |
| OpenCode | `opencode` | JSON run mode | Dynamic (`opencode models`) |
| OpenClaw | `openclaw` | JSON agent mode | Dynamic (`openclaw agents list`) |
| Hermes | `hermes` | ACP (JSON-RPC stdio) | Dynamic (ACP `session/new`) |
| Gemini | `gemini` | stream-json (stdin/stdout) | Static catalog |
| Pi | `pi` | JSON mode | Dynamic (`pi --list-models`) |
| Cursor | `cursor-agent` | stream-json (stdin/stdout) | Dynamic (`cursor-agent --list-models`) |
| Kimi | `kimi` | ACP (JSON-RPC stdio) | Dynamic (ACP `session/new`) |
## Agent Backend Interface

`server/pkg/agent/agent.go:15-86`:
```go
// The unified contract every agent must implement
type Backend interface {
	Execute(ctx context.Context, prompt string, opts ExecOptions) (*Session, error)
}

// Configuration for a single execution
type ExecOptions struct {
	Cwd             string          // Working directory
	Model           string          // LLM model override
	SystemPrompt    string          // Appended system prompt
	MaxTurns        int             // Maximum agent turns
	Timeout         time.Duration   // Execution timeout
	ResumeSessionID string          // Resume previous session
	CustomArgs      []string        // User-configured CLI args
	McpConfig       json.RawMessage // MCP server configuration
}

// Streaming results
type Session struct {
	Messages <-chan Message // Streaming events
	Result   <-chan Result  // Final outcome (exactly one)
}

// Unified event types from any agent
type Message struct {
	Type      MessageType // text, thinking, tool-use, tool-result, status, error, log
	Content   string
	Tool      string // Tool name (for tool-use/tool-result)
	CallID    string // Tool call ID
	Input     map[string]any
	Output    string
	Status    string
	SessionID string // For session resume pinning
}

// Final execution result
type Result struct {
	Status     string // completed, failed, aborted, timeout
	Output     string // Accumulated text output
	Error      string
	DurationMs int64
	SessionID  string                // For future resume
	Usage      map[string]TokenUsage // Per-model token tracking
}
```
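In practice a caller drains `Messages` until the channel closes, then reads the single value from `Result`. A minimal consumption sketch (the `backend` value, `repoDir`, and the string form of `MessageType` are illustrative assumptions, not repo code):

```go
// Sketch: drive a Backend and consume its streaming Session.
sess, err := backend.Execute(ctx, "fix the failing test", ExecOptions{
	Cwd:     repoDir,
	Timeout: 20 * time.Minute,
})
if err != nil {
	return err
}
for msg := range sess.Messages { // closed when the agent process exits
	switch msg.Type {
	case "tool-use": // assumes MessageType is a string type, per the values above
		fmt.Printf("[tool] %s %v\n", msg.Tool, msg.Input)
	case "text":
		fmt.Print(msg.Content)
	}
}
res := <-sess.Result // exactly one final outcome
fmt.Printf("%s in %dms\n", res.Status, res.DurationMs)
```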
## Claude Code Integration Deep Dive

The Claude integration (`server/pkg/agent/claude.go`) is the most detailed implementation and demonstrates the full pattern.
### Launch Configuration (:394-417)
```go
func buildClaudeArgs(opts ExecOptions, logger *slog.Logger) []string {
	args := []string{
		"-p",                             // Non-interactive (pipe) mode
		"--output-format", "stream-json", // Structured output
		"--input-format", "stream-json",  // Structured input
		"--verbose",                      // Full event stream
		"--strict-mcp-config",            // Only use provided MCP servers
		"--permission-mode", "bypassPermissions", // Autonomous operation
	}
	// Optional: --model, --max-turns, --append-system-prompt, --resume,
	// plus filtered custom_args from user configuration.
	return args
}
```

### Input Protocol (:430-448)
Prompts are sent as structured JSON via stdin:
```json
{
  "type": "user",
  "message": {
    "role": "user",
    "content": [{"type": "text", "text": "<prompt>"}]
  }
}
```

### Output Parsing (:133-173)
The daemon reads newline-delimited JSON from stdout:
| Message Type | Handling |
|---|---|
| `assistant` | Parse content blocks (text, thinking, tool_use); accumulate token usage |
| `user` | Parse `tool_result` blocks |
| `system` | Extract `session_id` for resume |
| `result` | Final output text, error status |
| `log` | Forward to daemon logger |
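The read loop has a conventional newline-delimited-JSON shape; a sketch (the envelope fields are simplified relative to the real parser at claude.go:133-173):

```go
// Sketch: read newline-delimited JSON events from the agent's stdout.
scanner := bufio.NewScanner(stdout)
scanner.Buffer(make([]byte, 0, 64*1024), 10*1024*1024) // tool output lines can be large
for scanner.Scan() {
	var ev struct {
		Type      string          `json:"type"`
		SessionID string          `json:"session_id"`
		Message   json.RawMessage `json:"message"`
	}
	if err := json.Unmarshal(scanner.Bytes(), &ev); err != nil {
		continue // tolerate non-JSON log lines
	}
	switch ev.Type {
	case "assistant":
		// parse content blocks, accumulate token usage
	case "system":
		// capture ev.SessionID for resume
	case "result":
		// final output text and status
	}
}
```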
### Token Usage Tracking (:220-228)
Per-model accumulation:
```go
u := usage[content.Model]
u.InputTokens += content.Usage.InputTokens
u.OutputTokens += content.Usage.OutputTokens
u.CacheReadTokens += content.Usage.CacheReadInputTokens
u.CacheWriteTokens += content.Usage.CacheCreationInputTokens
usage[content.Model] = u
```

### MCP Config Injection (:44-53)
When an agent has MCP server configuration, it's written to a temp file and passed via `--mcp-config`. This gives agents access to configured MCP tools (e.g., database access, web search) in a controlled way.
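The injection shape, as a sketch (the real temp-file naming and cleanup live at claude.go:44-53):

```go
// Sketch: write MCP config to a temp file and pass it via --mcp-config.
f, err := os.CreateTemp("", "multica-mcp-*.json")
if err != nil {
	return nil, err
}
if _, err := f.Write(opts.McpConfig); err != nil {
	f.Close()
	return nil, err
}
f.Close()
args = append(args, "--mcp-config", f.Name())
// The file is removed after the agent process exits.
```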
## Model Catalog & Discovery

`server/pkg/agent/models.go`:

### Static Catalogs
```go
// Claude (:137-145)
{ID: "claude-sonnet-4-6", Label: "Claude Sonnet 4.6", Default: true}
{ID: "claude-opus-4-7", Label: "Claude Opus 4.7"}
{ID: "claude-haiku-4-5-20251001", Label: "Claude Haiku 4.5"}

// Codex (:147-155)
{ID: "gpt-5.4", Label: "GPT-5.4", Default: true}
{ID: "gpt-5.4-mini", Label: "GPT-5.4 mini"}
{ID: "o3", Label: "o3"}

// Gemini (:168-181)
{ID: "auto", Label: "Auto (Gemini 3)", Default: true}
{ID: "pro", Label: "Pro"}
{ID: "flash", Label: "Flash"}
```

### Dynamic Discovery (:56-93)
For providers with evolving catalogs, models are discovered at runtime with 60-second caching:
```go
func ListModels(ctx context.Context, providerType, executablePath string) ([]Model, error) {
	switch providerType {
	case "claude":
		return claudeStaticModels(), nil
	case "cursor":
		return cachedDiscovery("cursor", discoverCursorModels)
	case "hermes":
		return cachedDiscovery("hermes", discoverHermesModels)
	// ...
	}
}
```
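The 60-second cache behind `cachedDiscovery` can be pictured as a mutex-guarded map (a sketch with simplified signatures; the actual TTL constant sits at models.go:46):

```go
// Sketch: TTL-cached model discovery to avoid spawning a CLI on every request.
var (
	cacheMu sync.Mutex
	cache   = map[string]cachedModels{}
)

type cachedModels struct {
	models  []Model
	fetched time.Time
}

func cachedDiscovery(key string, discover func() ([]Model, error)) ([]Model, error) {
	cacheMu.Lock()
	defer cacheMu.Unlock()
	if c, ok := cache[key]; ok && time.Since(c.fetched) < 60*time.Second {
		return c.models, nil // fresh enough: skip the CLI call
	}
	models, err := discover()
	if err != nil {
		return nil, err
	}
	cache[key] = cachedModels{models: models, fetched: time.Now()}
	return models, nil
}
```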
### ACP Protocol Discovery (:386-502)

For Hermes and Kimi (ACP-based agents), model discovery uses a minimal JSON-RPC handshake (sketched below):

- Launch `hermes acp` / `kimi acp` in a throwaway temp directory
- Send an `initialize` message (protocol version, client info)
- Send a `session/new` message
- Parse `models.availableModels` from the response
- Kill the process and return the models
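A sketch of what that handshake looks like on the wire (the method names `initialize` and `session/new` are from the flow above; the exact params are simplified, and `stdin`/`tmpDir` are assumed to be in scope):

```go
// Sketch: minimal JSON-RPC handshake over the spawned agent's stdio.
send := func(id int, method string, params any) error {
	msg := map[string]any{"jsonrpc": "2.0", "id": id, "method": method, "params": params}
	b, err := json.Marshal(msg)
	if err != nil {
		return err
	}
	_, err = stdin.Write(append(b, '\n'))
	return err
}
_ = send(1, "initialize", map[string]any{"protocolVersion": 1})
_ = send(2, "session/new", map[string]any{"cwd": tmpDir})
// The session/new response carries models.availableModels; parse it,
// return the models, then kill the process.
```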
## Prompt Engineering

### Four Prompt Types (`server/internal/daemon/prompt.go`)

#### 1. Assignment Prompt (:23-27)
```
You are running as a local coding agent for a Multica workspace.
Your assigned issue ID is: <uuid>

Start by running `multica issue get <uuid> --output json` to understand
your task, then complete it.
```
#### 2. Comment-Triggered Prompt (:36-58)
```
You are running as a local coding agent for a Multica workspace.
Your assigned issue ID is: <uuid>

[NEW COMMENT] A user just left a new comment. Focus on THIS comment:
> <comment content>

Start by running `multica issue get <uuid> --output json`...
```
With agent-to-agent loop prevention:
```
Warning: The triggering comment was posted by another agent. Before
replying, decide whether a reply is warranted at all. If that comment
was an acknowledgment, thanks, or sign-off and no concrete question
or task is being asked of you, do NOT reply.
```
#### 3. Chat Prompt (:61-67)
```
You are running as a chat assistant for a Multica workspace.
A user is chatting with you directly. Respond to their message.

User message:
<message content>
```
#### 4. Autopilot Prompt (:70-103)
```
You are running as a local coding agent for a Multica workspace.

This task was triggered by an Autopilot in run-only mode. There is no
assigned Multica issue for this run.

Autopilot run ID: <uuid>
Autopilot title: <title>
Trigger source: schedule

Autopilot instructions:
<description>
```
### Meta Skill Content (`server/internal/daemon/execenv/runtime_config.go:41-242`)
This is the comprehensive context document injected into every agent's working directory. It's the most important prompt engineering artifact in the system.
Structure:
- Header: "You are a coding agent in the Multica platform"
- Agent Identity (name, ID, custom instructions)
- CLI Command Reference (read + write commands)
- Available Repositories (with checkout instructions)
- Workflow mode instructions (chat/autopilot/comment/assignment)
- Skills listing
- Mention system with loop prevention rules
- Attachment handling
- Output requirements
Key prompt engineering techniques used:
- Tool documentation: Full CLI reference with examples so agents know how to interact
- Mode-specific behavior: Different instruction sets for chat vs issue work
- Negative instructions: "Do NOT use curl/wget", "Do NOT change status unless asked"
- Result delivery mandate: "Final results MUST be delivered via `multica issue comment add`"
- Loop prevention: Detailed rules about when to reply and when silence is better
## Guardrails & Safety Measures

### 1. PII/Secret Redaction (`server/pkg/redact/redact.go:19-52`)
Applied to agent output before database storage and WebSocket broadcast:
| Pattern | Example | Replacement |
|---|---|---|
| AWS access keys | `AKIA...` | `[REDACTED AWS KEY]` |
| AWS secrets | `aws_secret_access_key=...` | `[REDACTED AWS SECRET]` |
| PEM private keys | `-----BEGIN PRIVATE KEY-----` | `[REDACTED PRIVATE KEY]` |
| GitHub tokens | `ghp_...` | `[REDACTED GITHUB TOKEN]` |
| OpenAI/Anthropic keys | `sk-...` | `[REDACTED API KEY]` |
| Slack tokens | `xoxb-...` | `[REDACTED SLACK TOKEN]` |
| GitLab PATs | `glpat-...` | `[REDACTED GITLAB TOKEN]` |
| JWTs | `eyJ...` | `[REDACTED JWT]` |
| Bearer tokens | `Bearer ...` | `Bearer [REDACTED]` |
| Connection strings | `postgres://user:pass@...` | `[REDACTED CONNECTION STRING]@` |
| Generic secrets | `API_KEY=...` | `[REDACTED CREDENTIAL]` |
| Home directory paths | `/Users/john/...` | `/Users/****/...` |
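The redaction layer is ordered regex replacement; a sketch with two illustrative patterns (the full list and exact expressions are in redact.go:19-52):

```go
// Sketch: apply ordered regex redactions before storage/broadcast.
var redactions = []struct {
	re          *regexp.Regexp
	replacement string
}{
	{regexp.MustCompile(`AKIA[0-9A-Z]{16}`), "[REDACTED AWS KEY]"},       // illustrative pattern
	{regexp.MustCompile(`ghp_[A-Za-z0-9]{36}`), "[REDACTED GITHUB TOKEN]"}, // illustrative pattern
}

func Redact(s string) string {
	for _, r := range redactions {
		s = r.re.ReplaceAllString(s, r.replacement)
	}
	return s
}
```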
### 2. Protocol-Critical Arg Blocking (`server/pkg/agent/claude.go:386-392`)

User-configured `custom_args` cannot override these flags:
```go
var claudeBlockedArgs = map[string]blockedArgMode{
	"-p":                blockedStandalone, // Must stay non-interactive
	"--output-format":   blockedWithValue,  // Must be stream-json
	"--input-format":    blockedWithValue,  // Must be stream-json
	"--permission-mode": blockedWithValue,  // Must be bypassPermissions
	"--mcp-config":      blockedWithValue,  // Controlled by daemon
}
```
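Filtering then reduces to a linear scan that skips blocked flags and, for `blockedWithValue`, their values (a sketch; the function name is illustrative):

```go
// Sketch: drop protocol-critical flags from user-configured args.
func filterCustomArgs(custom []string) []string {
	var out []string
	for i := 0; i < len(custom); i++ {
		if mode, blocked := claudeBlockedArgs[custom[i]]; blocked {
			if mode == blockedWithValue {
				i++ // also skip the flag's value
			}
			continue
		}
		out = append(out, custom[i])
	}
	return out
}
```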
### 3. Child Environment Filtering (`server/pkg/agent/claude.go:483-487`)

Claude Code environment variables are stripped from child processes to prevent nested-session inheritance.
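A sketch of that environment filtering (the `CLAUDE_` prefix is an assumption; see claude.go:483-487 for the actual variable names):

```go
// Sketch: pass a child environment without Claude Code session variables.
var env []string
for _, kv := range os.Environ() {
	if strings.HasPrefix(kv, "CLAUDE_") { // assumed prefix
		continue
	}
	env = append(env, kv)
}
cmd.Env = env
```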
### 4. Timeout Enforcement

- Default timeout: 20 minutes per execution (Claude, `:33`)
- Configurable via `MULTICA_CLAUDE_TIMEOUT` etc.
- Max concurrent tasks: 20 per daemon (configurable)
- Context cancellation propagated to the child process (see the sketch below)
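The timeout-plus-cancellation combination is plain `context` + `os/exec`; a sketch:

```go
// Sketch: bound an execution and kill the child when the context ends.
runCtx, cancel := context.WithTimeout(ctx, opts.Timeout) // 20m default
defer cancel()
cmd := exec.CommandContext(runCtx, "claude", args...) // child killed when runCtx ends
if err := cmd.Run(); err != nil && runCtx.Err() == context.DeadlineExceeded {
	// surface Result.Status = "timeout" instead of a generic failure
}
```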
### 5. Agent Loop Prevention

Multi-layered defense against infinite agent-to-agent conversation loops:

- In the prompt (`prompt.go:52`): Explicit warning when the comment author is another agent
- In the meta skill (`runtime_config.go:161-162, 201-213`): Detailed rules about when NOT to mention other agents
- Rule of thumb: "If you are unsure whether a mention is warranted, don't mention. Silence ends conversations; @ restarts them."
### 6. Rate Limiting

- Feedback submissions: per-user hourly limit (`handler/feedback.go:66-72`)
- Model discovery caching: 60s TTL prevents CLI spam (`models.go:46`)
### 7. Signup Control

Configurable via environment variables:

- `ALLOW_SIGNUP`: Enable/disable new registrations
- `ALLOWED_EMAILS`: Whitelist specific emails
- `ALLOWED_EMAIL_DOMAINS`: Whitelist email domains
### 8. CSRF Protection

Cookie-based auth requires an `X-CSRF-Token` header for state-changing methods (POST, PUT, PATCH, DELETE), enforced via `middleware/auth.go`.
## Token Usage Tracking

Token consumption is tracked end-to-end:

1. The agent CLI reports per-model usage in its output stream
2. The daemon parses usage from the result (`daemon.go:1225-1235`)
3. The daemon reports usage via `POST /api/daemon/tasks/{id}/usage` with a per-model breakdown
4. The server stores it in the `task_usage` table (per-provider, per-model)
5. The frontend displays usage via the `/api/usage/daily` and `/api/usage/summary` endpoints
Token fields tracked per model:
- Input tokens
- Output tokens
- Cache read tokens (prompt caching efficiency)
- Cache write tokens (prompt caching cost)
Usage merging on task resume (`daemon.go:1477-1497`): when a task resumes a previous session, token usage from both runs is accumulated.
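Merging is per-model, per-field addition across runs (a sketch; the field names follow the `TokenUsage` accumulation shown earlier):

```go
// Sketch: accumulate token usage from a resumed run into the stored totals.
func mergeUsage(total, run map[string]TokenUsage) {
	for model, u := range run {
		t := total[model]
		t.InputTokens += u.InputTokens
		t.OutputTokens += u.OutputTokens
		t.CacheReadTokens += u.CacheReadTokens
		t.CacheWriteTokens += u.CacheWriteTokens
		total[model] = t
	}
}
```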
## Session Resume

The system supports resuming agent sessions across task runs:

1. The agent emits a `session_id` in its status events
2. The daemon pins the session via `POST /api/daemon/tasks/{id}/session`
3. On re-run (e.g., `RerunIssue`), the previous session ID is passed as `prior_session_id`
4. The agent receives `--resume <session_id>` to continue where it left off
5. If resume fails (session not found), `resolveSessionID` (`claude.go:457-462`) returns an empty string to trigger a fresh-session fallback (sketched below)
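On the argument side this reduces to a guard (a sketch consistent with `buildClaudeArgs` above):

```go
// Sketch: only pass --resume when a usable prior session exists.
if sid := resolveSessionID(opts); sid != "" {
	args = append(args, "--resume", sid)
}
// An empty session ID means: omit --resume and start a fresh session.
```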
## What's NOT in the LLM Integration

Notably absent:
- No direct API calls: Multica never calls OpenAI/Anthropic/Google APIs directly
- No prompt caching logic: That's handled by the agent CLIs themselves
- No fine-tuning: Uses off-the-shelf models via CLI wrappers
- No RAG/embeddings: Despite pgvector being available, it's not currently used for LLM features
- No content moderation: Relies on the underlying LLM provider's safety measures
- No token budgets: No per-workspace or per-agent token spending limits (usage is tracked but not capped)