# Multica Repository Analysis: LLM Integration & Agent System

## How LLMs Are Used
Multica does not call LLM APIs directly. Instead, it orchestrates local coding agent CLIs (Claude Code, Codex, Copilot, etc.) as subprocess workers. The LLM interaction happens inside the agent CLIs; Multica manages the lifecycle around them.
```
Multica Server             Daemon               Agent CLI          LLM API
       |                      |                     |                  |
       |--create task-------->|                     |                  |
       |                      |--spawn process----->|                  |
       |                      |                     |--API call------->|
       |                      |                     |<--response-------|
       |                      |<--stream-json events|                  |
       |<--progress/messages--|                     |                  |
       |                      |<--result------------|                  |
       |<--complete/usage-----|                     |                  |
```
This design makes Multica agent-agnostic: it needs no API keys for any LLM provider. The agents use whatever credentials are already configured on the user's machine.
## Supported Agent Providers

Defined in `server/pkg/agent/agent.go:97-126`:
| Provider | CLI | Protocol | Model Discovery |
|---|---|---|---|
| Claude | `claude` | stream-json (stdin/stdout) | Static catalog |
| Codex | `codex` | app-server (HTTP) | Static catalog |
| Copilot | `copilot` | JSON (stdin/stdout) | Static catalog |
| OpenCode | `opencode` | JSON run mode | Dynamic (`opencode models`) |
| OpenClaw | `openclaw` | JSON agent mode | Dynamic (`openclaw agents list`) |
| Hermes | `hermes` | ACP (JSON-RPC stdio) | Dynamic (ACP `session/new`) |
| Gemini | `gemini` | stream-json (stdin/stdout) | Static catalog |
| Pi | `pi` | JSON mode | Dynamic (`pi --list-models`) |
| Cursor | `cursor-agent` | stream-json (stdin/stdout) | Dynamic (`cursor-agent --list-models`) |
| Kimi | `kimi` | ACP (JSON-RPC stdio) | Dynamic (ACP `session/new`) |
## Agent Backend Interface

`server/pkg/agent/agent.go:15-86`:
```go
// The unified contract every agent must implement
type Backend interface {
	Execute(ctx context.Context, prompt string, opts ExecOptions) (*Session, error)
}

// Configuration for a single execution
type ExecOptions struct {
	Cwd             string          // Working directory
	Model           string          // LLM model override
	SystemPrompt    string          // Appended system prompt
	MaxTurns        int             // Maximum agent turns
	Timeout         time.Duration   // Execution timeout
	ResumeSessionID string          // Resume previous session
	CustomArgs      []string        // User-configured CLI args
	McpConfig       json.RawMessage // MCP server configuration
}

// Streaming results
type Session struct {
	Messages <-chan Message // Streaming events
	Result   <-chan Result  // Final outcome (exactly one)
}

// Unified event types from any agent
type Message struct {
	Type      MessageType // text, thinking, tool-use, tool-result, status, error, log
	Content   string
	Tool      string // Tool name (for tool-use/tool-result)
	CallID    string // Tool call ID
	Input     map[string]any
	Output    string
	Status    string
	SessionID string // For session resume pinning
}

// Final execution result
type Result struct {
	Status     string // completed, failed, aborted, timeout
	Output     string // Accumulated text output
	Error      string
	DurationMs int64
	SessionID  string                // For future resume
	Usage      map[string]TokenUsage // Per-model token tracking
}
```
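In practice a caller drains `Messages` until the channel closes, then reads the single value from `Result`. A minimal consumption sketch (the `backend` value, `repoDir`, and the string form of `MessageType` are illustrative assumptions, not repo code):

```go
// Sketch: drive a Backend and consume its streaming Session.
sess, err := backend.Execute(ctx, "fix the failing test", ExecOptions{
	Cwd:     repoDir,
	Timeout: 20 * time.Minute,
})
if err != nil {
	return err
}
for msg := range sess.Messages { // closed when the agent process exits
	switch msg.Type {
	case "tool-use": // assumes MessageType is a string type, per the values above
		fmt.Printf("[tool] %s %v\n", msg.Tool, msg.Input)
	case "text":
		fmt.Print(msg.Content)
	}
}
res := <-sess.Result // exactly one final outcome
fmt.Printf("%s in %dms\n", res.Status, res.DurationMs)
```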
## Claude Code Integration Deep Dive

The Claude integration (`server/pkg/agent/claude.go`) is the most detailed implementation and demonstrates the full pattern.
### Launch Configuration (:394-417)
```go
func buildClaudeArgs(opts ExecOptions, logger *slog.Logger) []string {
	args := []string{
		"-p",                             // Non-interactive (pipe) mode
		"--output-format", "stream-json", // Structured output
		"--input-format", "stream-json",  // Structured input
		"--verbose",                      // Full event stream
		"--strict-mcp-config",            // Only use provided MCP servers
		"--permission-mode", "bypassPermissions", // Autonomous operation
	}
	// Optional: --model, --max-turns, --append-system-prompt, --resume,
	// plus filtered custom_args from user configuration.
	return args
}
```

### Input Protocol (:430-448)
Prompts are sent as structured JSON via stdin:
```json
{
  "type": "user",
  "message": {
    "role": "user",
    "content": [{"type": "text", "text": "<prompt>"}]
  }
}
```

### Output Parsing (:133-173)
The daemon reads newline-delimited JSON from stdout:
| Message Type | Handling |
|---|---|
| `assistant` | Parse content blocks (text, thinking, tool_use); accumulate token usage |
| `user` | Parse `tool_result` blocks |
| `system` | Extract `session_id` for resume |
| `result` | Final output text, error status |
| `log` | Forward to daemon logger |
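The read loop has a conventional newline-delimited-JSON shape; a sketch (the envelope fields are simplified relative to the real parser at claude.go:133-173):

```go
// Sketch: read newline-delimited JSON events from the agent's stdout.
scanner := bufio.NewScanner(stdout)
scanner.Buffer(make([]byte, 0, 64*1024), 10*1024*1024) // tool output lines can be large
for scanner.Scan() {
	var ev struct {
		Type      string          `json:"type"`
		SessionID string          `json:"session_id"`
		Message   json.RawMessage `json:"message"`
	}
	if err := json.Unmarshal(scanner.Bytes(), &ev); err != nil {
		continue // tolerate non-JSON log lines
	}
	switch ev.Type {
	case "assistant":
		// parse content blocks, accumulate token usage
	case "system":
		// capture ev.SessionID for resume
	case "result":
		// final output text and status
	}
}
```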
### Token Usage Tracking (:220-228)
Per-model accumulation:
```go
u := usage[content.Model]
u.InputTokens += content.Usage.InputTokens
u.OutputTokens += content.Usage.OutputTokens
u.CacheReadTokens += content.Usage.CacheReadInputTokens
u.CacheWriteTokens += content.Usage.CacheCreationInputTokens
usage[content.Model] = u
```

### MCP Config Injection (:44-53)
When an agent has MCP server configuration, it's written to a temp file and passed via `--mcp-config`. This gives agents access to configured MCP tools (e.g., database access, web search) in a controlled way.
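The injection shape, as a sketch (the real temp-file naming and cleanup live at claude.go:44-53):

```go
// Sketch: write MCP config to a temp file and pass it via --mcp-config.
f, err := os.CreateTemp("", "multica-mcp-*.json")
if err != nil {
	return nil, err
}
if _, err := f.Write(opts.McpConfig); err != nil {
	f.Close()
	return nil, err
}
f.Close()
args = append(args, "--mcp-config", f.Name())
// The file is removed after the agent process exits.
```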
## Model Catalog & Discovery

`server/pkg/agent/models.go`:

### Static Catalogs
```go
// Claude (:137-145)
{ID: "claude-sonnet-4-6", Label: "Claude Sonnet 4.6", Default: true}
{ID: "claude-opus-4-7", Label: "Claude Opus 4.7"}
{ID: "claude-haiku-4-5-20251001", Label: "Claude Haiku 4.5"}

// Codex (:147-155)
{ID: "gpt-5.4", Label: "GPT-5.4", Default: true}
{ID: "gpt-5.4-mini", Label: "GPT-5.4 mini"}
{ID: "o3", Label: "o3"}

// Gemini (:168-181)
{ID: "auto", Label: "Auto (Gemini 3)", Default: true}
{ID: "pro", Label: "Pro"}
{ID: "flash", Label: "Flash"}
```

### Dynamic Discovery (:56-93)
For providers with evolving catalogs, models are discovered at runtime with 60-second caching:
```go
func ListModels(ctx context.Context, providerType, executablePath string) ([]Model, error) {
	switch providerType {
	case "claude":
		return claudeStaticModels(), nil
	case "cursor":
		return cachedDiscovery("cursor", discoverCursorModels)
	case "hermes":
		return cachedDiscovery("hermes", discoverHermesModels)
	// ...
	}
}
```
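The 60-second cache behind `cachedDiscovery` can be pictured as a mutex-guarded map (a sketch with simplified signatures; the actual TTL constant sits at models.go:46):

```go
// Sketch: TTL-cached model discovery to avoid spawning a CLI on every request.
var (
	cacheMu sync.Mutex
	cache   = map[string]cachedModels{}
)

type cachedModels struct {
	models  []Model
	fetched time.Time
}

func cachedDiscovery(key string, discover func() ([]Model, error)) ([]Model, error) {
	cacheMu.Lock()
	defer cacheMu.Unlock()
	if c, ok := cache[key]; ok && time.Since(c.fetched) < 60*time.Second {
		return c.models, nil // fresh enough: skip the CLI call
	}
	models, err := discover()
	if err != nil {
		return nil, err
	}
	cache[key] = cachedModels{models: models, fetched: time.Now()}
	return models, nil
}
```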
### ACP Protocol Discovery (:386-502)

For Hermes and Kimi (ACP-based agents), model discovery uses a minimal JSON-RPC handshake (sketched below):

- Launch `hermes acp` / `kimi acp` in a throwaway temp directory
- Send an `initialize` message (protocol version, client info)
- Send a `session/new` message
- Parse `models.availableModels` from the response
- Kill the process and return the models
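A sketch of what that handshake looks like on the wire (the method names `initialize` and `session/new` are from the flow above; the exact params are simplified, and `stdin`/`tmpDir` are assumed to be in scope):

```go
// Sketch: minimal JSON-RPC handshake over the spawned agent's stdio.
send := func(id int, method string, params any) error {
	msg := map[string]any{"jsonrpc": "2.0", "id": id, "method": method, "params": params}
	b, err := json.Marshal(msg)
	if err != nil {
		return err
	}
	_, err = stdin.Write(append(b, '\n'))
	return err
}
_ = send(1, "initialize", map[string]any{"protocolVersion": 1})
_ = send(2, "session/new", map[string]any{"cwd": tmpDir})
// The session/new response carries models.availableModels; parse it,
// return the models, then kill the process.
```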
## Prompt Engineering

### Four Prompt Types (`server/internal/daemon/prompt.go`)

#### 1. Assignment Prompt (:23-27)
```
You are running as a local coding agent for a Multica workspace.
Your assigned issue ID is: <uuid>

Start by running `multica issue get <uuid> --output json` to understand
your task, then complete it.
```
#### 2. Comment-Triggered Prompt (:36-58)
```
You are running as a local coding agent for a Multica workspace.
Your assigned issue ID is: <uuid>

[NEW COMMENT] A user just left a new comment. Focus on THIS comment:
> <comment content>

Start by running `multica issue get <uuid> --output json`...
```
With agent-to-agent loop prevention:
```
Warning: The triggering comment was posted by another agent. Before
replying, decide whether a reply is warranted at all. If that comment
was an acknowledgment, thanks, or sign-off and no concrete question
or task is being asked of you, do NOT reply.
```
#### 3. Chat Prompt (:61-67)
```
You are running as a chat assistant for a Multica workspace.
A user is chatting with you directly. Respond to their message.

User message:
<message content>
```
#### 4. Autopilot Prompt (:70-103)
```
You are running as a local coding agent for a Multica workspace.

This task was triggered by an Autopilot in run-only mode. There is no
assigned Multica issue for this run.

Autopilot run ID: <uuid>
Autopilot title: <title>
Trigger source: schedule

Autopilot instructions:
<description>
```
### Meta Skill Content (`server/internal/daemon/execenv/runtime_config.go:41-242`)
This is the comprehensive context document injected into every agent's working directory. It's the most important prompt engineering artifact in the system.
Structure:
- Header: "You are a coding agent in the Multica platform"
- Agent Identity (name, ID, custom instructions)
- CLI Command Reference (read + write commands)
- Available Repositories (with checkout instructions)
- Workflow mode instructions (chat/autopilot/comment/assignment)
- Skills listing
- Mention system with loop prevention rules
- Attachment handling
- Output requirements
Key prompt engineering techniques used:
- Tool documentation: Full CLI reference with examples so agents know how to interact
- Mode-specific behavior: Different instruction sets for chat vs issue work
- Negative instructions: "Do NOT use curl/wget", "Do NOT change status unless asked"
- Result delivery mandate: "Final results MUST be delivered via `multica issue comment add`"
- Loop prevention: Detailed rules about when to reply and when silence is better
## Guardrails & Safety Measures

### 1. PII/Secret Redaction (`server/pkg/redact/redact.go:19-52`)
Applied to agent output before database storage and WebSocket broadcast:
| Pattern | Example | Replacement |
|---|---|---|
| AWS access keys | `AKIA...` | `[REDACTED AWS KEY]` |
| AWS secrets | `aws_secret_access_key=...` | `[REDACTED AWS SECRET]` |
| PEM private keys | `-----BEGIN PRIVATE KEY-----` | `[REDACTED PRIVATE KEY]` |
| GitHub tokens | `ghp_...` | `[REDACTED GITHUB TOKEN]` |
| OpenAI/Anthropic keys | `sk-...` | `[REDACTED API KEY]` |
| Slack tokens | `xoxb-...` | `[REDACTED SLACK TOKEN]` |
| GitLab PATs | `glpat-...` | `[REDACTED GITLAB TOKEN]` |
| JWTs | `eyJ...` | `[REDACTED JWT]` |
| Bearer tokens | `Bearer ...` | `Bearer [REDACTED]` |
| Connection strings | `postgres://user:pass@...` | `[REDACTED CONNECTION STRING]@` |
| Generic secrets | `API_KEY=...` | `[REDACTED CREDENTIAL]` |
| Home directory paths | `/Users/john/...` | `/Users/****/...` |
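The redaction layer is ordered regex replacement; a sketch with two illustrative patterns (the full list and exact expressions are in redact.go:19-52):

```go
// Sketch: apply ordered regex redactions before storage/broadcast.
var redactions = []struct {
	re          *regexp.Regexp
	replacement string
}{
	{regexp.MustCompile(`AKIA[0-9A-Z]{16}`), "[REDACTED AWS KEY]"},       // illustrative pattern
	{regexp.MustCompile(`ghp_[A-Za-z0-9]{36}`), "[REDACTED GITHUB TOKEN]"}, // illustrative pattern
}

func Redact(s string) string {
	for _, r := range redactions {
		s = r.re.ReplaceAllString(s, r.replacement)
	}
	return s
}
```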
### 2. Protocol-Critical Arg Blocking (`server/pkg/agent/claude.go:386-392`)

User-configured `custom_args` cannot override these flags:
```go
var claudeBlockedArgs = map[string]blockedArgMode{
	"-p":                blockedStandalone, // Must stay non-interactive
	"--output-format":   blockedWithValue,  // Must be stream-json
	"--input-format":    blockedWithValue,  // Must be stream-json
	"--permission-mode": blockedWithValue,  // Must be bypassPermissions
	"--mcp-config":      blockedWithValue,  // Controlled by daemon
}
```
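Filtering then reduces to a linear scan that skips blocked flags and, for `blockedWithValue`, their values (a sketch; the function name is illustrative):

```go
// Sketch: drop protocol-critical flags from user-configured args.
func filterCustomArgs(custom []string) []string {
	var out []string
	for i := 0; i < len(custom); i++ {
		if mode, blocked := claudeBlockedArgs[custom[i]]; blocked {
			if mode == blockedWithValue {
				i++ // also skip the flag's value
			}
			continue
		}
		out = append(out, custom[i])
	}
	return out
}
```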
### 3. Child Environment Filtering (`server/pkg/agent/claude.go:483-487`)

Claude Code environment variables are stripped from child processes to prevent nested-session inheritance.
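A sketch of that environment filtering (the `CLAUDE_` prefix is an assumption; see claude.go:483-487 for the actual variable names):

```go
// Sketch: pass a child environment without Claude Code session variables.
var env []string
for _, kv := range os.Environ() {
	if strings.HasPrefix(kv, "CLAUDE_") { // assumed prefix
		continue
	}
	env = append(env, kv)
}
cmd.Env = env
```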
### 4. Timeout Enforcement

- Default timeout: 20 minutes per execution (Claude, `:33`)
- Configurable via `MULTICA_CLAUDE_TIMEOUT` etc.
- Max concurrent tasks: 20 per daemon (configurable)
- Context cancellation propagated to the child process (see the sketch below)
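The timeout-plus-cancellation combination is plain `context` + `os/exec`; a sketch:

```go
// Sketch: bound an execution and kill the child when the context ends.
runCtx, cancel := context.WithTimeout(ctx, opts.Timeout) // 20m default
defer cancel()
cmd := exec.CommandContext(runCtx, "claude", args...) // child killed when runCtx ends
if err := cmd.Run(); err != nil && runCtx.Err() == context.DeadlineExceeded {
	// surface Result.Status = "timeout" instead of a generic failure
}
```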
### 5. Agent Loop Prevention

Multi-layered defense against infinite agent-to-agent conversation loops:

- In the prompt (`prompt.go:52`): Explicit warning when the comment author is another agent
- In the meta skill (`runtime_config.go:161-162, 201-213`): Detailed rules about when NOT to mention other agents
- Rule of thumb: "If you are unsure whether a mention is warranted, don't mention. Silence ends conversations; @ restarts them."
### 6. Rate Limiting

- Feedback submissions: per-user hourly limit (`handler/feedback.go:66-72`)
- Model discovery caching: 60s TTL prevents CLI spam (`models.go:46`)
### 7. Signup Control

Configurable via environment variables:

- `ALLOW_SIGNUP`: Enable/disable new registrations
- `ALLOWED_EMAILS`: Whitelist specific emails
- `ALLOWED_EMAIL_DOMAINS`: Whitelist email domains
### 8. CSRF Protection

Cookie-based auth requires an `X-CSRF-Token` header for state-changing methods (POST, PUT, PATCH, DELETE), enforced via `middleware/auth.go`.
## Token Usage Tracking

Token consumption is tracked end-to-end:

1. The agent CLI reports per-model usage in its output stream
2. The daemon parses usage from the result (`daemon.go:1225-1235`)
3. The daemon reports usage via `POST /api/daemon/tasks/{id}/usage` with a per-model breakdown
4. The server stores it in the `task_usage` table (per-provider, per-model)
5. The frontend displays usage via the `/api/usage/daily` and `/api/usage/summary` endpoints
Token fields tracked per model:
- Input tokens
- Output tokens
- Cache read tokens (prompt caching efficiency)
- Cache write tokens (prompt caching cost)
Usage merging on task resume (`daemon.go:1477-1497`): when a task resumes a previous session, token usage from both runs is accumulated.
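Merging is per-model, per-field addition across runs (a sketch; the field names follow the `TokenUsage` accumulation shown earlier):

```go
// Sketch: accumulate token usage from a resumed run into the stored totals.
func mergeUsage(total, run map[string]TokenUsage) {
	for model, u := range run {
		t := total[model]
		t.InputTokens += u.InputTokens
		t.OutputTokens += u.OutputTokens
		t.CacheReadTokens += u.CacheReadTokens
		t.CacheWriteTokens += u.CacheWriteTokens
		total[model] = t
	}
}
```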
## Session Resume

The system supports resuming agent sessions across task runs:

1. The agent emits a `session_id` in its status events
2. The daemon pins the session via `POST /api/daemon/tasks/{id}/session`
3. On re-run (e.g., `RerunIssue`), the previous session ID is passed as `prior_session_id`
4. The agent receives `--resume <session_id>` to continue where it left off
5. If resume fails (session not found), `resolveSessionID` (`claude.go:457-462`) returns an empty string to trigger a fresh-session fallback (sketched below)
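On the argument side this reduces to a guard (a sketch consistent with `buildClaudeArgs` above):

```go
// Sketch: only pass --resume when a usable prior session exists.
if sid := resolveSessionID(opts); sid != "" {
	args = append(args, "--resume", sid)
}
// An empty session ID means: omit --resume and start a fresh session.
```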
## What's NOT in the LLM Integration

Notably absent:
- No direct API calls: Multica never calls OpenAI/Anthropic/Google APIs directly
- No prompt caching logic: That's handled by the agent CLIs themselves
- No fine-tuning: Uses off-the-shelf models via CLI wrappers
- No RAG/embeddings: Despite pgvector being available, it's not currently used for LLM features
- No content moderation: Relies on the underlying LLM provider's safety measures
- No token budgets: No per-workspace or per-agent token spending limits (usage is tracked but not capped)