Lessons for Agentic Framework Builders

Transferable patterns and design decisions from NanoClaw, distilled for anyone building their own agentic framework.

1. Container IS the Sandbox

NanoClaw runs agents with permissionMode: 'bypassPermissions' (container/agent-runner/src/index.ts:415) — no "are you sure?" prompts, no tool-level access control. The container boundary handles all of it.

Lesson: Choose one strong isolation boundary rather than layering weak application-level checks. If your framework uses containers/VMs, give the agent full autonomy inside and focus security effort on what gets mounted and what credentials are injected.

Tradeoff: Container cold-start latency (~2-5s) vs. instant in-process execution. NanoClaw mitigates this with persistent containers (see #3).

2. File-Based IPC Over Network Protocols

The entire agent-to-host communication layer is JSON files written to a shared directory (/workspace/ipc/). No HTTP server, no WebSocket, no gRPC.

Container writes:  /workspace/ipc/messages/*.json  →  Host reads and routes
Container writes:  /workspace/ipc/tasks/*.json     →  Host reads and executes
Host writes:       /workspace/ipc/input/*.json     →  Container reads as follow-up
Host writes:       /workspace/ipc/input/_close      →  Container exits gracefully

Atomic write pattern (container/agent-runner/src/ipc-mcp-stdio.ts:23-35):

const tempPath = filepath + '.tmp';   // write under a temporary name first
fs.writeFileSync(tempPath, JSON.stringify(data));
fs.renameSync(tempPath, filepath);    // rename is atomic on the same filesystem

Lesson: Consider whether you actually need network IPC between agent and host. If the agent lifecycle is container-scoped, file IPC is simpler, more debuggable (ls and cat the directories), and works identically across Docker, Apple Container, and Podman.

Tradeoff: ~1s polling latency vs. instant push. Acceptable for chat-based interactions; may not suit real-time applications.
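
The host side of this scheme can be sketched in a few lines: a poller that treats each completed *.json file as one message. The function name and layout here are illustrative, not NanoClaw's actual code:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Read every completed *.json file once, oldest first, then remove it.
function drainIpcDir(dir: string): unknown[] {
  if (!fs.existsSync(dir)) return [];
  const files = fs.readdirSync(dir)
    .filter((f) => f.endsWith(".json"))  // skips temp files still being written
    .sort();                             // filename order approximates write order
  const messages: unknown[] = [];
  for (const f of files) {
    const full = path.join(dir, f);
    messages.push(JSON.parse(fs.readFileSync(full, "utf8")));
    fs.unlinkSync(full);                 // consume each message exactly once
  }
  return messages;
}
```

Because writers rename files into place atomically, the reader never observes a half-written JSON document.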

3. Persistent Multi-Turn Containers with Idle Timeout

Most agent frameworks spawn a container per request. NanoClaw keeps containers alive after responding, waiting for follow-up messages via the query loop (container/agent-runner/src/index.ts:582-615):

while (true) {
  await runQuery(prompt, sessionId, ...);
  const nextMessage = await waitForIpcMessage();  // Block until next message or _close
  if (nextMessage === null) break;
  prompt = nextMessage;
}

The container exits only when:

- the host writes the _close sentinel to the input directory, or
- the idle timeout expires with no follow-up message.

Lesson: Persistent containers buy you multi-turn conversations without cold starts. The SDK session stays warm, the context window is preserved, and follow-up messages arrive in <1s instead of paying the 2-5s container startup on every turn.

Tradeoff: Memory/CPU usage for idle containers. Managed by the idle timeout and MAX_CONCURRENT_CONTAINERS cap.
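
A minimal sketch of what waitForIpcMessage might look like, assuming the input-directory layout above; the message shape, timeout, and polling defaults are illustrative:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

// Poll the input directory until a follow-up arrives, _close appears,
// or the idle timeout expires.
async function waitForIpcMessage(
  inputDir: string,
  idleTimeoutMs = 30 * 60 * 1000,  // assumed: exit after 30 min of silence
  pollMs = 1000,                   // ~1s polling latency (see tradeoff in #2)
): Promise<string | null> {
  const deadline = Date.now() + idleTimeoutMs;
  while (Date.now() < deadline) {
    if (fs.existsSync(path.join(inputDir, "_close"))) return null; // graceful exit
    const next = fs.readdirSync(inputDir)
      .filter((f) => f.endsWith(".json"))
      .sort()[0];
    if (next) {
      const full = path.join(inputDir, next);
      const { prompt } = JSON.parse(fs.readFileSync(full, "utf8"));
      fs.unlinkSync(full);
      return prompt;
    }
    await sleep(pollMs);
  }
  return null;  // idle timeout: treat like _close
}
```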

4. Script Pre-check for Scheduled Tasks

Before waking the LLM, a bash script runs and returns {wakeAgent: boolean, data: any} (container/agent-runner/src/index.ts:476-516):

#!/bin/bash
latest=$(curl -s https://api.github.com/repos/org/repo/releases/latest | jq -r .tag_name)
if [ "$latest" = "v1.0.0" ]; then
  echo '{"wakeAgent": false}'
else
  echo '{"wakeAgent": true, "data": {"newVersion": "'$latest'"}}'
fi

Lesson: A 30-second bash check costs nothing vs. a full Claude invocation. Use cheap pre-filters before expensive AI calls. The data field lets the script enrich the prompt with dynamic context.

Applicability: Any scheduled/polling agent system where most checks result in "nothing to do."
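
The host-side wrapper around such a script can be small: run it, parse the JSON, and only build a prompt when wakeAgent is true. Function names here are assumptions, not NanoClaw's actual code:

```typescript
import { execFileSync } from "node:child_process";

interface PrecheckResult { wakeAgent: boolean; data?: unknown; }

// Run the pre-check script and parse its {wakeAgent, data} output.
function runPrecheck(scriptPath: string): PrecheckResult {
  const stdout = execFileSync("bash", [scriptPath], { encoding: "utf8" });
  return JSON.parse(stdout.trim());
}

// Returns null when the agent should not be woken; otherwise enriches
// the base prompt with the script's dynamic data.
function buildPrompt(basePrompt: string, result: PrecheckResult): string | null {
  if (!result.wakeAgent) return null;  // skip the expensive LLM call entirely
  return result.data === undefined
    ? basePrompt
    : `${basePrompt}\n\nPre-check data: ${JSON.stringify(result.data)}`;
}
```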

5. Context Accumulation (Non-Trigger Messages as Batch Context)

Messages that don't trigger the agent are stored but not processed. When a trigger arrives, the agent sees all accumulated context (src/index.ts:483-493):

User A: "The build is broken"           ← stored, no trigger
User B: "Yeah, the tests fail too"      ← stored, no trigger
User A: "@Andy can you help debug?"     ← trigger! Agent sees ALL THREE messages

Lesson: Don't discard non-actionable input — batch it for when action is needed. This gives the agent richer context without being invoked for every message.

Applicability: Any agent that monitors a stream of events but should only act on certain triggers.
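
A minimal sketch of the accumulate-then-flush logic, with an @-mention standing in for whatever trigger rule a framework uses (names are illustrative):

```typescript
interface ChatMessage { sender: string; text: string; }

// Per-chat buffer of messages that have not yet reached the agent.
const pending = new Map<string, ChatMessage[]>();

// Returns null for non-trigger messages (stored only); on a trigger,
// returns the full accumulated batch for the agent to see.
function onMessage(
  chatId: string,
  msg: ChatMessage,
  trigger = "@Andy",
): ChatMessage[] | null {
  const buffer = pending.get(chatId) ?? [];
  buffer.push(msg);                 // store every message, trigger or not
  if (!msg.text.includes(trigger)) {
    pending.set(chatId, buffer);    // no trigger: just accumulate
    return null;
  }
  pending.delete(chatId);           // trigger: hand the agent ALL context
  return buffer;
}
```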

6. GroupQueue Priority System

Tasks get priority over messages during queue drain (src/group-queue.ts:291-301) because:

- scheduled tasks are time-sensitive: run too late, and the moment they were scheduled for has passed
- messages persist in the database, so they can always be picked up on a later drain

Lesson: Prioritize work that's harder to recover from missing. In a queue with mixed work types, drain the ephemeral/time-sensitive items first.
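
A sketch of such a drain order over a queue of mixed work items; the shapes here are illustrative, not NanoClaw's GroupQueue:

```typescript
type Work = { kind: "task" | "message"; enqueuedAt: number; payload: string };

// Stable sort: tasks before messages, FIFO within each kind.
function drainOrder(queue: Work[]): Work[] {
  return [...queue].sort((a, b) =>
    a.kind === b.kind
      ? a.enqueuedAt - b.enqueuedAt
      : a.kind === "task" ? -1 : 1);
}
```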

7. Credential Proxy Over Credential Injection

Instead of ANTHROPIC_API_KEY as an env var (readable and exfiltrable by the agent), NanoClaw uses a gateway proxy (src/container-runner.ts:226-249):

Agent → ANTHROPIC_BASE_URL (points to local OneCLI gateway) → gateway injects real key → Anthropic API

The .env file is shadowed with /dev/null in container mounts (src/container-runner.ts:82-91).

Lesson: Defense in depth against credential theft by autonomous agents. Even if the agent tries to read environment variables or mounted files, it never sees real secrets.

Implementation cost: Requires running a credential proxy (OneCLI in NanoClaw's case). Worth it for any framework where agents have bash access.
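
The core of such a gateway is a header rewrite: whatever the agent sent is discarded, and the real key, known only to the host process, is injected before forwarding. A minimal sketch of just that step (the forwarding itself is omitted; x-api-key is the Anthropic-style key header):

```typescript
// Rewrite incoming request headers so the agent's credentials never
// reach the upstream API; only the host-held key does.
function injectCredentials(
  incoming: Record<string, string>,
  realApiKey: string,
): Record<string, string> {
  const headers = { ...incoming };
  delete headers["x-api-key"];       // discard anything the agent supplied
  delete headers["authorization"];
  headers["x-api-key"] = realApiKey; // inject the real key at the boundary
  return headers;
}
```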

8. Per-Entity Isolation at Every Layer

Not just filesystem isolation, but isolation at every layer that carries state: the mounted workspace, the IPC directories, the agent's session history, and the per-group queue.

Lesson: If your framework supports multi-tenant or multi-context execution, isolate at every layer, not just the obvious ones. An agent that can read another group's session history is a privilege escalation even if filesystem isolation is perfect.
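
One way to make this systematic is to derive every per-group resource from the group ID in a single place, so no layer can accidentally share state. A sketch, with names and layout as assumptions:

```typescript
import * as path from "node:path";

// Derive all per-group resources from one ID so isolation is uniform
// across layers: filesystem mount, IPC channel, and session state.
function groupResources(baseDir: string, groupId: string) {
  // Reject IDs that could escape the per-group directory.
  if (!/^[A-Za-z0-9._-]+$/.test(groupId)) throw new Error("invalid group id");
  return {
    workspace: path.join(baseDir, "groups", groupId),      // filesystem mount
    ipcDir: path.join(baseDir, "groups", groupId, "ipc"),  // IPC directory
    sessionKey: `session:${groupId}`,                      // agent session state
  };
}
```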

9. Internal Tags for Agent Reasoning

The <internal>...</internal> tag convention lets agents include reasoning in their output that gets stripped before delivery (src/index.ts:292):

const text = raw.replace(/<internal>[\s\S]*?<\/internal>/g, '').trim();

Lesson: Simple regex, zero overhead. Agents can "think out loud" while keeping responses clean. The internal content is still logged for debugging. Useful for any framework where agent output goes to end users.
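
Wrapped as a function for illustration, with the raw text still available for the debug log:

```typescript
// Strip <internal>...</internal> spans before delivery; callers can log
// the raw string separately to keep the reasoning for debugging.
function stripInternal(raw: string): string {
  return raw.replace(/<internal>[\s\S]*?<\/internal>/g, "").trim();
}
```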

10. Skills as Code Transformations, Not Runtime Plugins

Rather than a plugin API with loading, versioning, and compatibility concerns, NanoClaw uses git branch merges. A "skill" permanently transforms the codebase:

User runs /add-telegram
  → Claude Code reads .claude/skills/add-telegram/SKILL.md
  → Claude Code merges skill/add-telegram branch
  → Telegram support is now part of the codebase (not a plugin)

Lesson: Some "plugins" are better modeled as one-time code transforms than as runtime extensions. This eliminates plugin API versioning, dynamic loading bugs, and dependency conflicts. Each user's fork is a clean, customized installation.

Tradeoff: Harder to update (branch merges can conflict) vs. plugin version bumps. NanoClaw addresses this with /update-nanoclaw and /update-skills workflows.

11. Dual-Cursor Crash Recovery

Two cursors for message processing (src/index.ts:70-73, 121-136): a "seen" cursor that advances on every incoming message, and a "processed" cursor (lastAgentTimestamp below) that advances only after the agent has successfully handled a message.

Recovery from the database (src/index.ts:121-136):

function getOrRecoverCursor(chatJid: string): string {
  const existing = lastAgentTimestamp[chatJid];
  if (existing) return existing;
  // Recover from last bot reply in DB
  return getLastBotMessageTimestamp(chatJid, ASSISTANT_NAME) || '';
}

Cursor rollback on error, with duplicate-prevention (src/index.ts:314-331):

if (hadError && !outputSentToUser) {
  lastAgentTimestamp[chatJid] = previousCursor;  // Safe to retry
}
// If output was already sent, DON'T rollback (would cause duplicates)

Lesson: For any system processing a stream of events:

  1. Separate "seen" from "processed" cursors
  2. Recover from the database, not just in-memory state
  3. Distinguish "error before output" (retry-safe) from "error after output" (can't retry)
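
The three rules can be condensed into a single cursor-update function (a sketch; the field names are illustrative):

```typescript
interface Cursors { seen: string; processed: string; }

// Update both cursors after attempting to process the event at eventTs:
// "seen" always advances; "processed" advances only when it is safe.
function afterRun(
  cursors: Cursors,
  eventTs: string,
  hadError: boolean,
  outputSentToUser: boolean,
): Cursors {
  const seen = eventTs;  // rule 1: always record what we observed
  if (!hadError) return { seen, processed: eventTs };
  if (!outputSentToUser) {
    // rule 3a: error before output — hold the processed cursor back
    // so the event is retried on the next drain.
    return { seen, processed: cursors.processed };
  }
  // rule 3b: error after output — advance anyway to avoid duplicates.
  return { seen, processed: eventTs };
}
```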

12. Pre-Compact Conversation Archival

SDK hook archives full conversation transcripts before context compaction (container/agent-runner/src/index.ts:147-187):

hooks: {
  PreCompact: [{ hooks: [createPreCompactHook()] }],
}

Lesson: If your framework uses context compaction (summarizing old messages to free context window), hook into the compaction lifecycle to preserve the full transcript. Without this, conversation history is irreversibly summarized and detail is lost.
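
A hypothetical sketch of what such a hook body might do — copy the transcript aside before the SDK rewrites it. The transcript path and archive layout are assumptions, not NanoClaw's actual hook:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Copy the full transcript to an archive directory so history survives
// compaction; returns the archived file's path.
function archiveTranscript(transcriptPath: string, archiveDir: string): string {
  fs.mkdirSync(archiveDir, { recursive: true });
  const dest = path.join(
    archiveDir,
    `${Date.now()}-${path.basename(transcriptPath)}`,  // timestamped copy
  );
  fs.copyFileSync(transcriptPath, dest);
  return dest;
}
```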

Summary Matrix

Pattern                  Complexity  Impact  When to Use
Container as sandbox     Medium      High    Always, if using containers
File-based IPC           Low         Medium  Container-scoped agents
Persistent containers    Medium      High    Multi-turn conversations
Script pre-check         Low         Medium  Scheduled/polling agents
Context accumulation     Low         High    Stream-monitoring agents
Queue priority           Low         Medium  Mixed work types
Credential proxy         High        High    Agents with bash/network access
Per-entity isolation     Medium      High    Multi-tenant systems
Internal tags            Low         Low     User-facing agent output
Skills as transforms     High        Medium  Extensible single-user tools
Dual-cursor recovery     Medium      High    Stream processing with reliability needs
Pre-compact archival     Low         Medium  Long-running conversations