Lessons for Agentic Framework Builders
Transferable patterns and design decisions from NanoClaw, distilled for anyone building their own agentic framework.
1. Container IS the Sandbox
NanoClaw runs agents with permissionMode: 'bypassPermissions' (container/agent-runner/src/index.ts:415) — no "are you sure?" prompts, no tool-level access control. The container boundary handles all of it.
Lesson: Choose one strong isolation boundary rather than layering weak application-level checks. If your framework uses containers/VMs, give the agent full autonomy inside and focus security effort on what gets mounted and what credentials are injected.
Tradeoff: Container cold-start latency (~2-5s) vs. instant in-process execution. NanoClaw mitigates this with persistent containers (see #3).
2. File-Based IPC Over Network Protocols
The entire agent-to-host communication layer is JSON files written to a shared directory (/workspace/ipc/). No HTTP server, no WebSocket, no gRPC.
- Container writes /workspace/ipc/messages/*.json → Host reads and routes
- Container writes /workspace/ipc/tasks/*.json → Host reads and executes
- Host writes /workspace/ipc/input/*.json → Container reads as follow-up
- Host writes /workspace/ipc/input/_close → Container exits gracefully
Atomic write pattern (container/agent-runner/src/ipc-mcp-stdio.ts:23-35):

```typescript
fs.writeFileSync(tempPath, JSON.stringify(data));
fs.renameSync(tempPath, filepath); // Atomic on same filesystem
```

Lesson: Consider whether you actually need network IPC between agent and host. If the agent lifecycle is container-scoped, file IPC is simpler, more debuggable (`ls` and `cat` the directories), and works identically across Docker, Apple Container, and Podman.
Tradeoff: ~1s polling latency vs. instant push. Acceptable for chat-based interactions; may not suit real-time applications.
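On the host side, reading these directories needs nothing more than a periodic scan. A minimal sketch of a drain function under the layout above; the handler signature and the delete-after-processing behavior are assumptions for illustration, not NanoClaw's exact code:

```typescript
import * as fs from 'fs';
import * as path from 'path';

// Scan a directory for JSON payloads, hand each to a handler, and delete it
// so the next scan only sees fresh work. Returns how many files were drained.
function drainIpcDir(dir: string, handle: (msg: any) => void): number {
  if (!fs.existsSync(dir)) return 0;
  const files = fs.readdirSync(dir).filter((f) => f.endsWith('.json')).sort();
  for (const file of files) {
    const full = path.join(dir, file);
    handle(JSON.parse(fs.readFileSync(full, 'utf8')));
    fs.unlinkSync(full); // Processed: remove so it is not re-read
  }
  return files.length;
}
```

Paired with the atomic write above, a scan never observes a half-written file; a `setInterval` of ~1s around `drainIpcDir` gives the polling latency described below.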
3. Persistent Multi-Turn Containers with Idle Timeout
Most agent frameworks spawn a container per request. NanoClaw keeps containers alive after responding, waiting for follow-up messages via the query loop (container/agent-runner/src/index.ts:582-615):
```typescript
while (true) {
  await runQuery(prompt, sessionId, ...);
  const nextMessage = await waitForIpcMessage(); // Block until next message or _close
  if (nextMessage === null) break;
  prompt = nextMessage;
}
```

The container exits only when:
- A `_close` sentinel file appears (idle timeout, default 30 min)
- The hard timeout fires
- It's preempted by a scheduled task
Lesson: Multi-turn conversations without cold starts. The SDK session remains warm, context window is preserved, and follow-up messages arrive in <1s instead of 2-5s container startup.
Tradeoff: Memory/CPU usage for idle containers. Managed by the idle timeout and MAX_CONCURRENT_CONTAINERS cap.
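The waitForIpcMessage call in the loop above can be a plain poll over the input directory. A minimal sketch, assuming the /workspace/ipc/input layout from pattern #2 and a JSON payload with a prompt field (the field name is an assumption):

```typescript
import * as fs from 'fs';
import * as path from 'path';

type Poll = { kind: 'close' } | { kind: 'message'; prompt: string } | null;

// One scan of the input directory: a _close sentinel wins, then the oldest
// follow-up message, else nothing yet.
function pollInputDir(inputDir: string): Poll {
  if (fs.existsSync(path.join(inputDir, '_close'))) return { kind: 'close' };
  const next = fs.readdirSync(inputDir).filter((f) => f.endsWith('.json')).sort()[0];
  if (!next) return null;
  const full = path.join(inputDir, next);
  const { prompt } = JSON.parse(fs.readFileSync(full, 'utf8'));
  fs.unlinkSync(full);
  return { kind: 'message', prompt };
}

const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

// Block (by polling) until a follow-up arrives; null means "exit the loop".
async function waitForIpcMessage(inputDir: string, idleTimeoutMs: number): Promise<string | null> {
  const deadline = Date.now() + idleTimeoutMs;
  while (Date.now() < deadline) {
    const result = pollInputDir(inputDir);
    if (result?.kind === 'close') return null;
    if (result?.kind === 'message') return result.prompt;
    await sleep(1000); // ~1s poll, matching the latency tradeoff in pattern #2
  }
  return null; // Idle timeout behaves like _close
}
```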
4. Script Pre-check for Scheduled Tasks
Before waking the LLM, a bash script runs and returns {wakeAgent: boolean, data: any} (container/agent-runner/src/index.ts:476-516):
```bash
#!/bin/bash
latest=$(curl -s https://api.github.com/repos/org/repo/releases/latest | jq -r .tag_name)
if [ "$latest" = "v1.0.0" ]; then
  echo '{"wakeAgent": false}'
else
  echo '{"wakeAgent": true, "data": {"newVersion": "'$latest'"}}'
fi
```

Lesson: A 30-second bash check costs nothing vs. a full Claude invocation. Use cheap pre-filters before expensive AI calls. The `data` field lets the script enrich the prompt with dynamic context.
Applicability: Any scheduled/polling agent system where most checks result in "nothing to do."
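The host-side half of this pattern is small as well. A hedged sketch of consuming the {wakeAgent, data} contract; the function names and prompt-enrichment format are illustrative, not NanoClaw's actual API:

```typescript
import { execFileSync } from 'child_process';

interface PrecheckResult {
  wakeAgent: boolean;
  data?: unknown;
}

// Run the task's pre-check script and parse its single-line JSON verdict.
function runPrecheck(scriptPath: string): PrecheckResult {
  const stdout = execFileSync('bash', [scriptPath], { encoding: 'utf8' });
  return JSON.parse(stdout.trim()) as PrecheckResult;
}

// null means "do not wake the agent": the expensive LLM call is skipped.
// When data is present, it is appended so the prompt carries dynamic context.
function buildPrompt(basePrompt: string, result: PrecheckResult): string | null {
  if (!result.wakeAgent) return null;
  return result.data
    ? `${basePrompt}\n\nPre-check data: ${JSON.stringify(result.data)}`
    : basePrompt;
}
```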
5. Context Accumulation (Non-Trigger Messages as Batch Context)
Messages that don't trigger the agent are stored but not processed. When a trigger arrives, the agent sees all accumulated context (src/index.ts:483-493):
User A: "The build is broken" ← stored, no trigger
User B: "Yeah, the tests fail too" ← stored, no trigger
User A: "@Andy can you help debug?" ← trigger! Agent sees ALL THREE messages
Lesson: Don't discard non-actionable input — batch it for when action is needed. This gives the agent richer context without being invoked for every message.
Applicability: Any agent that monitors a stream of events but should only act on certain triggers.
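The accumulate-then-flush logic above can be sketched in a few lines, with a hypothetical @Andy mention standing in for the real trigger rule:

```typescript
// Per-chat buffer of messages that did not trigger the agent.
const pending = new Map<string, string[]>();

// Trigger rule is illustrative; NanoClaw's actual rule lives in its router.
function isTrigger(text: string): boolean {
  return text.includes('@Andy');
}

// Returns null for non-trigger messages (stored for later); on a trigger,
// returns the accumulated context plus the triggering message, then resets.
function onMessage(chatId: string, text: string): string[] | null {
  const buffer = pending.get(chatId) ?? [];
  buffer.push(text);
  if (!isTrigger(text)) {
    pending.set(chatId, buffer);
    return null;
  }
  pending.delete(chatId); // Flush: the agent has now seen everything
  return buffer;
}
```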
6. GroupQueue Priority System
Tasks get priority over messages during queue drain (src/group-queue.ts:291-301) because:
- Tasks can't be re-discovered from the database (they'd miss their schedule)
- Messages persist in SQLite and can be retried on the next poll
Lesson: Prioritize work that's harder to recover from missing. In a queue with mixed work types, drain the ephemeral/time-sensitive items first.
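The drain order itself is nearly a one-liner. A sketch with the work-item shapes simplified to their essentials:

```typescript
type Work =
  | { kind: 'task'; id: string }     // Scheduled: a missed run is not re-discovered
  | { kind: 'message'; id: string }; // Persisted in SQLite: safe to retry on the next poll

// Drain tasks before messages, preserving arrival order within each kind.
function drainOrder(queue: Work[]): Work[] {
  return [
    ...queue.filter((w) => w.kind === 'task'),
    ...queue.filter((w) => w.kind === 'message'),
  ];
}
```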
7. Credential Proxy Over Credential Injection
Instead of ANTHROPIC_API_KEY as an env var (readable and exfiltrable by the agent), NanoClaw uses a gateway proxy (src/container-runner.ts:226-249):
Agent → ANTHROPIC_BASE_URL (points to local OneCLI gateway) → gateway injects real key → Anthropic API
The .env file is shadowed with /dev/null in container mounts (src/container-runner.ts:82-91).
Lesson: Defense in depth against credential theft by autonomous agents. Even if the agent tries to read environment variables or mounted files, it never sees real secrets.
Implementation cost: Requires running a credential proxy (OneCLI in NanoClaw's case). Worth it for any framework where agents have bash access.
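The gateway's core move is header rewriting: the container only ever holds a dummy key plus the gateway URL, and the real secret is swapped in before the request leaves the host. A minimal sketch of the injection step alone (names are illustrative, and a real gateway also forwards the rewritten request upstream):

```typescript
// Gateway-held secret: set on the host, never mounted into the container.
const REAL_KEY = process.env.REAL_ANTHROPIC_KEY ?? 'sk-real-example';

// Replace whatever credential the agent sent with the real one.
function injectCredential(headers: Record<string, string>): Record<string, string> {
  const out = { ...headers };
  delete out['authorization']; // Drop any agent-supplied bearer token
  out['x-api-key'] = REAL_KEY; // The agent never sees this value
  return out;
}
```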
8. Per-Entity Isolation at Every Layer
Not just filesystem isolation, but:
- Separate IPC namespaces — prevents cross-group commands (src/container-runner.ts:168-178)
- Separate `.claude/` directories — prevents cross-group session access (src/container-runner.ts:118-166)
- Separate agent-runner source — allows per-group customization (src/container-runner.ts:182-211)
- Separate OneCLI agents — per-group credential scoping (src/index.ts:81-98)
Lesson: If your framework supports multi-tenant or multi-context execution, isolate at every layer, not just the obvious ones. An agent that can read another group's session history is a privilege escalation even if filesystem isolation is perfect.
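One way to keep all the layers consistent is to derive every per-group path from a single ID, so no layer can accidentally share state. A sketch with an illustrative directory layout, not NanoClaw's actual one:

```typescript
import * as path from 'path';

// Every per-group location hangs off one base directory keyed by group ID.
function groupPaths(root: string, groupId: string) {
  const base = path.join(root, 'groups', groupId);
  return {
    ipcDir: path.join(base, 'ipc'),             // Separate IPC namespace
    claudeDir: path.join(base, '.claude'),      // Separate session store
    runnerDir: path.join(base, 'agent-runner'), // Per-group source copy
  };
}
```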
9. Internal Tags for Agent Reasoning
The <internal>...</internal> tag convention lets agents include reasoning in their output that gets stripped before delivery (src/index.ts:292):
```typescript
const text = raw.replace(/<internal>[\s\S]*?<\/internal>/g, '').trim();
```

Lesson: Simple regex, zero overhead. Agents can "think out loud" while keeping responses clean. The internal content is still logged for debugging. Useful for any framework where agent output goes to end users.
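Wrapped as a helper that also keeps the stripped spans for logging, as described above (the split-and-collect shape is an illustration, not NanoClaw's exact code):

```typescript
// Split agent output into user-visible text and internal reasoning.
// The internal spans are returned so the caller can log them.
function splitInternal(raw: string): { text: string; internal: string[] } {
  const internal: string[] = [];
  const text = raw
    .replace(/<internal>([\s\S]*?)<\/internal>/g, (_match, body: string) => {
      internal.push(body.trim());
      return '';
    })
    .trim();
  return { text, internal };
}
```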
10. Skills as Code Transformations, Not Runtime Plugins
Rather than a plugin API with loading, versioning, and compatibility concerns, NanoClaw uses git branch merges. A "skill" permanently transforms the codebase:
User runs /add-telegram
→ Claude Code reads .claude/skills/add-telegram/SKILL.md
→ Claude Code merges skill/add-telegram branch
→ Telegram support is now part of the codebase (not a plugin)
Lesson: Some "plugins" are better modeled as one-time code transforms than as runtime extensions. This eliminates plugin API versioning, dynamic loading bugs, and dependency conflicts. Each user's fork is a clean, customized installation.
Tradeoff: Harder to update (branch merges can conflict) vs. plugin version bumps. NanoClaw addresses this with /update-nanoclaw and /update-skills workflows.
11. Dual-Cursor Crash Recovery
Two cursors for message processing (src/index.ts:70-73, 121-136):
- Global cursor (`lastTimestamp`) — "I've polled messages up to here"
- Per-group cursor (`lastAgentTimestamp[chatJid]`) — "Agent has processed messages up to here for this group"
Recovery from the database (src/index.ts:121-136):
```typescript
function getOrRecoverCursor(chatJid: string): string {
  const existing = lastAgentTimestamp[chatJid];
  if (existing) return existing;
  // Recover from last bot reply in DB
  return getLastBotMessageTimestamp(chatJid, ASSISTANT_NAME) || '';
}
```

Cursor rollback on error, with duplicate prevention (src/index.ts:314-331):

```typescript
if (hadError && !outputSentToUser) {
  lastAgentTimestamp[chatJid] = previousCursor; // Safe to retry
}
// If output was already sent, DON'T roll back (would cause duplicates)
```

Lesson: For any system processing a stream of events:
- Separate "seen" from "processed" cursors
- Recover from the database, not just in-memory state
- Distinguish "error before output" (retry-safe) from "error after output" (can't retry)
12. Pre-Compact Conversation Archival
SDK hook archives full conversation transcripts before context compaction (container/agent-runner/src/index.ts:147-187):
```typescript
hooks: {
  PreCompact: [{ hooks: [createPreCompactHook()] }],
}
```

Lesson: If your framework uses context compaction (summarizing old messages to free context window), hook into the compaction lifecycle to preserve the full transcript. Without this, conversation history is irreversibly summarized and detail is lost.
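A sketch of what such a hook can do: copy the transcript aside before compaction discards detail. The hook input shape (a transcript_path field) follows Claude Code's hook convention and is an assumption here, as is the archive location:

```typescript
import * as fs from 'fs';
import * as path from 'path';

// Copy the full transcript into an archive directory, timestamped so
// repeated compactions never overwrite an earlier snapshot.
function archiveTranscript(transcriptPath: string, archiveDir: string): string {
  fs.mkdirSync(archiveDir, { recursive: true });
  const dest = path.join(archiveDir, `${Date.now()}-${path.basename(transcriptPath)}`);
  fs.copyFileSync(transcriptPath, dest);
  return dest;
}

// Hypothetical hook factory; archive location is an assumption.
const createPreCompactHook = () => async (input: { transcript_path: string }) => {
  archiveTranscript(input.transcript_path, '/workspace/archive');
  return {}; // Let compaction proceed
};
```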
Summary Matrix
| Pattern | Complexity | Impact | When to Use |
|---|---|---|---|
| Container as sandbox | Medium | High | Always, if using containers |
| File-based IPC | Low | Medium | Container-scoped agents |
| Persistent containers | Medium | High | Multi-turn conversations |
| Script pre-check | Low | Medium | Scheduled/polling agents |
| Context accumulation | Low | High | Stream-monitoring agents |
| Queue priority | Low | Medium | Mixed work types |
| Credential proxy | High | High | Agents with bash/network access |
| Per-entity isolation | Medium | High | Multi-tenant systems |
| Internal tags | Low | Low | User-facing agent output |
| Skills as transforms | High | Medium | Extensible single-user tools |
| Dual-cursor recovery | Medium | High | Stream processing with reliability needs |
| Pre-compact archival | Low | Medium | Long-running conversations |