Key Abstractions & Design Patterns

1. Channel Interface — Plugin Architecture

File: src/types.ts:83-94

export interface Channel {
  name: string;
  connect(): Promise<void>;
  sendMessage(jid: string, text: string): Promise<void>;
  isConnected(): boolean;
  ownsJid(jid: string): boolean;
  disconnect(): Promise<void>;
  setTyping?(jid: string, isTyping: boolean): Promise<void>;
  syncGroups?(force: boolean): Promise<void>;
}

Pattern: Self-registering Plugin Registry

// src/channels/registry.ts:16
const registry = new Map<string, ChannelFactory>();
 
export function registerChannel(name: string, factory: ChannelFactory): void {
  registry.set(name, factory);
}

Channels register themselves at import time. The barrel import in src/channels/index.ts triggers registration for all installed channels. The factory returns Channel | null — null when credentials are missing, allowing graceful degradation.

Why this matters for your framework: This is a clean, zero-config plugin system. Adding a new channel requires:

1. Implementing the Channel interface
2. Calling registerChannel() at module scope
3. Adding the import to the barrel file

No configuration file, no plugin manifest, no dynamic require. The factory's null-return pattern handles optional channels without try/catch or feature flags.
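The three steps above can be sketched as follows. This is a minimal illustration, not the project's code: the Channel interface is trimmed to its required methods, the ChannelFactory signature (taking an environment map) is an assumption, and EchoChannel, ECHO_TOKEN, and createChannels are hypothetical names.

```typescript
// Sketch only: ChannelFactory's signature is assumed; EchoChannel,
// ECHO_TOKEN and createChannels are hypothetical names.
interface Channel {
  name: string;
  connect(): Promise<void>;
  sendMessage(jid: string, text: string): Promise<void>;
  isConnected(): boolean;
  ownsJid(jid: string): boolean;
  disconnect(): Promise<void>;
}

type Env = Record<string, string | undefined>;
type ChannelFactory = (env: Env) => Channel | null;

const registry = new Map<string, ChannelFactory>();

function registerChannel(name: string, factory: ChannelFactory): void {
  registry.set(name, factory);
}

class EchoChannel implements Channel {
  name = "echo";
  private connected = false;
  readonly sent: string[] = [];
  async connect(): Promise<void> { this.connected = true; }
  async sendMessage(jid: string, text: string): Promise<void> {
    this.sent.push(`${jid}: ${text}`);
  }
  isConnected(): boolean { return this.connected; }
  ownsJid(jid: string): boolean { return jid.startsWith("echo:"); }
  async disconnect(): Promise<void> { this.connected = false; }
}

// Module-scope registration: runs as soon as this file is imported.
// Returning null when the credential is absent marks the channel as unconfigured.
registerChannel("echo", (env) => (env.ECHO_TOKEN ? new EchoChannel() : null));

// Host side: build every registered channel, skipping unconfigured ones.
function createChannels(env: Env): Channel[] {
  const channels: Channel[] = [];
  for (const factory of registry.values()) {
    const channel = factory(env);
    if (channel !== null) channels.push(channel);
  }
  return channels;
}
```

With this shape, the host needs no per-channel conditionals: a channel whose factory returns null never appears in the list.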

2. GroupQueue — Per-entity Concurrency Controller

File: src/group-queue.ts

The GroupQueue manages the fundamental constraint: only one container per group at a time, with a global cap.

Pattern: State Machine per Entity + Global Semaphore

interface GroupState {
  active: boolean;           // Container currently running
  idleWaiting: boolean;      // Container finished work, waiting for IPC
  isTaskContainer: boolean;  // Running a scheduled task (can't accept piped messages)
  runningTaskId: string | null;
  pendingMessages: boolean;  // Messages queued while container was active
  pendingTasks: QueuedTask[];
  process: ChildProcess | null;
  containerName: string | null;
  groupFolder: string | null;
  retryCount: number;
}

Key behaviors:

Messages that arrive while a container is active are piped to it via IPC instead of starting a new container (a task container is the exception: it cannot accept piped messages)
Tasks that arrive while a container is active preempt it by writing the _close sentinel
When a container finishes, drainGroup() checks for pending work (tasks first, then messages)
When a group has nothing pending, drainWaiting() unblocks other groups waiting for a concurrency slot

Why this matters: The scheduler handles the real-world complexity of multi-group messaging: messages arrive continuously, containers take minutes to respond, and the system needs both fairness across groups and efficiency within a group.
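The interaction between per-group serialization and the global cap can be sketched with a much-reduced model. MiniGroupQueue below is hypothetical: it keeps only the "one container per group" and "at most N containers overall" rules and leaves out IPC piping, task preemption, and retries.

```typescript
// Sketch only: MiniGroupQueue is a hypothetical, simplified model of the
// scheduling rules; it does not start real containers.
type Decision = "started" | "queued-for-group" | "waiting-for-slot";

class MiniGroupQueue {
  private readonly maxConcurrent: number;
  private active = new Set<string>();  // groups with a running container
  private pending = new Set<string>(); // groups with work queued behind their own container
  private waiting: string[] = [];      // groups blocked on the global cap, in FIFO order

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  enqueue(group: string): Decision {
    if (this.active.has(group)) {
      this.pending.add(group); // per-group serialization
      return "queued-for-group";
    }
    if (this.active.size >= this.maxConcurrent) {
      if (!this.waiting.includes(group)) this.waiting.push(group);
      return "waiting-for-slot"; // global cap reached
    }
    this.active.add(group);
    return "started";
  }

  // Called when a group's container exits; returns the groups started as a result.
  finish(group: string): string[] {
    this.active.delete(group);
    const started: string[] = [];
    if (this.pending.delete(group)) {
      this.active.add(group); // the same group drains its own backlog first
      started.push(group);
    }
    while (this.active.size < this.maxConcurrent && this.waiting.length > 0) {
      const next = this.waiting.shift()!;
      this.active.add(next); // a freed slot unblocks a waiting group
      started.push(next);
    }
    return started;
  }
}
```

In this model a group with pending work keeps its slot, and other groups are unblocked only once a slot is actually free, which mirrors the drainGroup() then drainWaiting() order described above.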

3. MessageStream — Async Iterable for Multi-turn Queries

File: container/agent-runner/src/index.ts:67-97

class MessageStream {
  private queue: SDKUserMessage[] = [];
  private waiting: (() => void) | null = null;
  private done = false;
 
  push(text: string): void { /* add to queue, wake waiter */ }
  end(): void { /* signal completion */ }
 
  async *[Symbol.asyncIterator](): AsyncGenerator<SDKUserMessage> {
    while (true) {
      while (this.queue.length > 0) yield this.queue.shift()!;
      if (this.done) return;
      await new Promise<void>(r => { this.waiting = r; });
    }
  }
}

Pattern: Push-to-pull adapter via AsyncGenerator

The SDK's query() expects an async iterable of messages. MessageStream bridges the gap between push-based IPC (files appearing in a directory) and pull-based SDK consumption.

Why it exists: Without this, the SDK would see a single prompt and enter isSingleUserTurn mode, which prevents agent-team subagents from running to completion. The open stream tells the SDK "more input may come" even when the queue is empty.
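A self-contained sketch of the adapter, with the elided push() and end() bodies filled in, might look like the following. The bodies are assumptions, UserMessage is a stand-in for the SDK's SDKUserMessage type, and collect() is a hypothetical consumer playing the role of query().

```typescript
// Sketch only: UserMessage stands in for SDKUserMessage, and the bodies of
// push() and end() are assumed, not copied from the project.
interface UserMessage {
  role: "user";
  content: string;
}

class MessageStreamSketch {
  private queue: UserMessage[] = [];
  private waiting: (() => void) | null = null;
  private done = false;

  push(text: string): void {
    this.queue.push({ role: "user", content: text });
    this.wake();
  }

  end(): void {
    this.done = true;
    this.wake();
  }

  // Resolve the consumer's pending promise, if it is currently parked.
  private wake(): void {
    const resolve = this.waiting;
    this.waiting = null;
    if (resolve) resolve();
  }

  async *[Symbol.asyncIterator](): AsyncGenerator<UserMessage> {
    while (true) {
      while (this.queue.length > 0) yield this.queue.shift()!;
      if (this.done) return;
      // Park until push() or end() wakes us; the stream stays open meanwhile.
      await new Promise<void>((r) => { this.waiting = r; });
    }
  }
}

// Hypothetical consumer: pulls until the producer calls end().
async function collect(stream: MessageStreamSketch): Promise<string[]> {
  const seen: string[] = [];
  for await (const message of stream) seen.push(message.content);
  return seen;
}
```

The consumer only finishes once end() is called, so an empty queue by itself never terminates the iteration.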

4. File-based IPC — Communication Without Networking

Files: container/agent-runner/src/ipc-mcp-stdio.ts, src/ipc.ts

Container (writes)                    Host (reads)
─────────────────                     ───────────────
/workspace/ipc/messages/*.json  ──►   IPC watcher → channel.sendMessage()
/workspace/ipc/tasks/*.json     ──►   IPC watcher → processTaskIpc()
/workspace/ipc/input/*.json     ◄──   GroupQueue.sendMessage()
/workspace/ipc/input/_close     ◄──   GroupQueue.closeStdin()

Pattern: File system as message bus with atomic writes

// container/agent-runner/src/ipc-mcp-stdio.ts:23-35
function writeIpcFile(dir: string, data: object): string {
  const filename = `${Date.now()}-${Math.random().toString(36).slice(2, 8)}.json`;
  const filepath = path.join(dir, filename);
  const tempPath = `${filepath}.tmp`;
  fs.writeFileSync(tempPath, JSON.stringify(data, null, 2));
  fs.renameSync(tempPath, filepath);  // Atomic on same filesystem
  return filename;
}

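The write-then-rename pattern ensures the reader never sees partial files. The timestamp-prefixed filenames let the reader recover approximate write order by sorting on the prefix; files written within the same millisecond are distinguished only by their random suffix, so their relative order is arbitrary.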
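On the reading side, the watcher has to pick out completed files and process them in order. The helper below is a hypothetical sketch of that selection step only; it assumes the file naming scheme produced by writeIpcFile and takes a plain list of directory entries (for example, the result of fs.readdirSync) so it stays independent of the file system.

```typescript
// Sketch only: selectIpcFiles is a hypothetical helper, not the project's watcher.
// It assumes names of the form `${Date.now()}-${random}.json`.
function selectIpcFiles(entries: string[]): string[] {
  return entries
    // Only completed files: in-flight `*.json.tmp` files and the `_close`
    // sentinel do not end in ".json" and are skipped.
    .filter((name) => name.endsWith(".json"))
    .sort((a, b) => {
      // Compare the numeric timestamp prefix; fall back to the full name
      // so the order is deterministic for same-millisecond writes.
      const ta = Number.parseInt(a, 10) || 0;
      const tb = Number.parseInt(b, 10) || 0;
      if (ta !== tb) return ta - tb;
      return a < b ? -1 : a > b ? 1 : 0;
    });
}
```

Sorting on the parsed number rather than the raw string avoids surprises if two prefixes ever differ in digit count.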

Why file-based IPC over sockets/HTTP:

No networking inside the container (simpler, more secure)
Debuggable (you can ls and cat the IPC directories)
Works with any container runtime (Docker, Apple Container, Podman)
Atomic rename is sufficient for the throughput requirements

5. Sentinel Marker Protocol — Structured Output in an Unstructured Stream

Files: container/agent-runner/src/index.ts:109-116, src/container-runner.ts:34-35

const OUTPUT_START_MARKER = '---NANOCLAW_OUTPUT_START---';
const OUTPUT_END_MARKER = '---NANOCLAW_OUTPUT_END---';

Pattern: In-band signaling with sentinel delimiters

The container's stdout is shared between the SDK's debug output, the agent's tool calls, and NanoClaw's structured results. Sentinel markers delimit the structured JSON:

[SDK debug output...]
---NANOCLAW_OUTPUT_START---
{"status":"success","result":"Here's the weather...","newSessionId":"abc"}
---NANOCLAW_OUTPUT_END---
[More SDK output...]
---NANOCLAW_OUTPUT_START---
{"status":"success","result":null,"newSessionId":"abc"}
---NANOCLAW_OUTPUT_END---

The host's incremental parser (src/container-runner.ts:369-396) handles partial reads — it buffers until it finds a complete START/END pair.
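The buffering behavior can be sketched as below. SentinelParser is a hypothetical class written for illustration, not the code in src/container-runner.ts; it assumes each payload between the markers is a single JSON document and lets JSON.parse throw on malformed input.

```typescript
// Sketch only: SentinelParser is a hypothetical illustration of incremental
// sentinel parsing, not the project's implementation.
const OUTPUT_START_MARKER = "---NANOCLAW_OUTPUT_START---";
const OUTPUT_END_MARKER = "---NANOCLAW_OUTPUT_END---";

class SentinelParser {
  private buffer = "";

  // Feed one stdout chunk; returns every payload completed by this chunk.
  feed(chunk: string): unknown[] {
    this.buffer += chunk;
    const results: unknown[] = [];
    while (true) {
      const start = this.buffer.indexOf(OUTPUT_START_MARKER);
      if (start === -1) {
        // No START yet: keep only a tail that could hold a marker split across chunks.
        this.buffer = this.buffer.slice(-(OUTPUT_START_MARKER.length - 1));
        break;
      }
      const bodyStart = start + OUTPUT_START_MARKER.length;
      const end = this.buffer.indexOf(OUTPUT_END_MARKER, bodyStart);
      if (end === -1) {
        // START without END: drop the noise before START and wait for more data.
        this.buffer = this.buffer.slice(start);
        break;
      }
      results.push(JSON.parse(this.buffer.slice(bodyStart, end).trim()));
      this.buffer = this.buffer.slice(end + OUTPUT_END_MARKER.length);
    }
    return results;
  }
}
```

Because unrelated output is discarded as soon as it cannot contain a marker, the buffer stays small even when the container logs heavily between results.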

6. Mount Security — Defense in Depth for Volume Mounts

File: src/mount-security.ts

Pattern: External allowlist with layered validation

Validation layers:
1. Allowlist existence check — no allowlist = all additional mounts blocked
2. Container path validation — no .. segments, no absolute paths, no colons
3. Blocked pattern check — .ssh, .gnupg, .env, credentials, etc.
4. Allowed root check — path must be under an explicitly allowed directory
5. Read-write permission check — per-root and per-group restrictions
6. Symlink resolution — realpath before comparison (prevents symlink bypass)

The allowlist lives at ~/.config/nanoclaw/mount-allowlist.json, outside the project root and outside any container mount, so agent code running inside a container cannot modify it.

// src/mount-security.ts:23-41
const DEFAULT_BLOCKED_PATTERNS = [
  '.ssh', '.gnupg', '.gpg', '.aws', '.azure', '.gcloud',
  '.kube', '.docker', 'credentials', '.env', '.netrc',
  '.npmrc', '.pypirc', 'id_rsa', 'id_ed25519', 'private_key', '.secret',
];
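A few of the validation layers can be sketched as one pure function. checkMount is hypothetical and makes several assumptions: the host path has already been resolved with realpath, blocked patterns match as substrings of individual path segments, and the read-write permission layer is omitted. How src/mount-security.ts actually matches patterns is not shown in the excerpt above.

```typescript
// Sketch only: checkMount is a hypothetical, reduced version of the layered
// checks. Pattern matching semantics here are an assumption.
const BLOCKED_PATTERNS = [".ssh", ".gnupg", ".env", "credentials", "id_rsa"];

interface MountCheck {
  allowed: boolean;
  reason: string;
}

function checkMount(
  hostPath: string,              // assumed already symlink-resolved
  containerPath: string,
  allowedRoots: string[] | null, // null models a missing allowlist file
): MountCheck {
  // Layer 1: no allowlist means every additional mount is blocked.
  if (allowedRoots === null) return { allowed: false, reason: "no allowlist" };

  // Layer 2: the container path must be relative, colon-free, and without "..".
  if (
    containerPath.startsWith("/") ||
    containerPath.includes(":") ||
    containerPath.split("/").includes("..")
  ) {
    return { allowed: false, reason: "invalid container path" };
  }

  // Layer 3: reject host paths with a sensitive-looking segment.
  const segments = hostPath.split("/");
  const blocked = BLOCKED_PATTERNS.find((p) => segments.some((s) => s.includes(p)));
  if (blocked) return { allowed: false, reason: `blocked pattern ${blocked}` };

  // Layer 4: the host path must equal an allowed root or sit beneath one.
  const underRoot = allowedRoots.some((root) => {
    const prefix = root.endsWith("/") ? root : root + "/";
    return hostPath === root || hostPath.startsWith(prefix);
  });
  if (!underRoot) return { allowed: false, reason: "outside allowed roots" };

  return { allowed: true, reason: "ok" };
}
```

Appending a separator before the prefix comparison matters: without it, /home/u/projects-evil would pass as being under /home/u/projects.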

7. Credential Proxy — Secrets Without Secret Access

Pattern: Gateway-based credential injection

Containers never receive API keys directly. Instead:

Container → HTTPS request to api.anthropic.com
    │
    ▼ (ANTHROPIC_BASE_URL points to OneCLI gateway on host)

OneCLI gateway intercepts → injects real API key → forwards to Anthropic

The .env file is shadowed with /dev/null in container mounts (src/container-runner.ts:82-91), so even the main group's read-only project mount doesn't expose secrets.
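The injection step itself is conceptually small. The function below is a hypothetical sketch of what a gateway could do to request headers before forwarding; it is not OneCLI's implementation, whose internals are not described here. The only external fact it relies on is that the Anthropic API authenticates requests with the x-api-key header.

```typescript
// Sketch only: injectCredentials and HeaderMap are hypothetical names; this
// illustrates the idea of gateway-side injection, not OneCLI's actual behavior.
type HeaderMap = Record<string, string>;

function injectCredentials(incoming: HeaderMap, realApiKey: string): HeaderMap {
  const outgoing: HeaderMap = {};
  for (const [name, value] of Object.entries(incoming)) {
    const lower = name.toLowerCase();
    // Drop whatever placeholder credential the container sent.
    if (lower === "x-api-key" || lower === "authorization") continue;
    outgoing[lower] = value;
  }
  // The real key exists only on the host side of the gateway.
  outgoing["x-api-key"] = realApiKey;
  return outgoing;
}
```

Under this model the container can hold any placeholder value, since the credential it sends is discarded before the request leaves the host.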

8. Pre-compact Hook — Conversation Archival

File: container/agent-runner/src/index.ts:147-187

Pattern: SDK lifecycle hook for data preservation

function createPreCompactHook(assistantName?: string): HookCallback {
  return async (input) => {
    // Narrow the generic hook input to the pre-compact variant
    const preCompact = input as PreCompactHookInput;
    const transcriptPath = preCompact.transcript_path;
    // Read full transcript, parse into user/assistant messages
    // Archive to /workspace/group/conversations/{date}-{summary}.md
    return {};
  };
}

Before the SDK compacts context (summarizes to free up context window), this hook archives the full conversation as a Markdown file. This preserves complete conversation history that would otherwise be lost to compaction.
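The two commented-out steps (parse the transcript, render it as Markdown) can be sketched as pure functions. The transcript layout assumed here, one JSON object per line with a type field and a string message.content, is an assumption made for illustration; real transcripts may carry richer content. parseTranscript and toMarkdown are hypothetical names.

```typescript
// Sketch only: the JSONL entry shape below is assumed, and both function
// names are hypothetical.
interface ParsedMessage {
  role: "user" | "assistant";
  content: string;
}

interface TranscriptEntry {
  type?: string;
  message?: { content?: unknown };
}

function parseTranscript(jsonl: string): ParsedMessage[] {
  const messages: ParsedMessage[] = [];
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    try {
      const entry = JSON.parse(line) as TranscriptEntry;
      const role = entry.type;
      const content = entry.message?.content;
      // Keep only user/assistant turns that carry plain-text content.
      if ((role === "user" || role === "assistant") && typeof content === "string") {
        messages.push({ role, content });
      }
    } catch {
      // Skip lines that are not valid JSON rather than failing the archive.
    }
  }
  return messages;
}

function toMarkdown(messages: ParsedMessage[], assistantName = "Assistant"): string {
  const blocks = messages.map(
    (m) => `**${m.role === "user" ? "User" : assistantName}**: ${m.content}`,
  );
  return blocks.join("\n\n") + "\n";
}
```

Tolerating unparseable lines keeps one corrupt entry from losing the rest of the conversation at the moment compaction is about to discard it.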

9. Skill System — Git Branches as Feature Flags

Pattern: Code as configuration via branch merging

Skills are not plugins loaded at runtime. They are git branches that modify the codebase:

.claude/skills/add-telegram/SKILL.md    ← Instructions for Claude Code
skill/add-telegram (git branch)          ← Actual code changes

User runs /add-telegram
  → Claude Code reads SKILL.md
  → Claude Code merges skill/add-telegram branch
  → Code is now part of the project

Four skill types serve different purposes:

Feature skills — git branch merges that add channels/integrations
Utility skills — self-contained code alongside SKILL.md instructions
Operational skills — pure instruction files (setup, debug workflows)
Container skills — loaded inside agent containers at runtime

Why this matters: Skills transform the codebase rather than extending it at runtime. This means no dynamic loading, no plugin API versioning, no dependency conflicts. Each user's fork is a clean, customized installation.

10. Cursor Recovery — Crash Resilience

Pattern: Dual-cursor with DB-backed recovery

Global cursor (lastTimestamp):
  "I've polled all messages up to this timestamp"
  → Advances in the message loop
  → If we crash here, some messages may be re-polled (idempotent)

Per-group cursor (lastAgentTimestamp[chatJid]):
  "Agent has processed messages up to this timestamp for this group"
  → Advances when messages are sent to agent
  → If we crash here, recovery checks getLastBotMessageTimestamp()

The recovery function (src/index.ts:121-136) reconstructs the per-group cursor from the last bot message in the database. This handles:

New group with no cursor history
Corrupted state after crash
Missing cursor after database migration
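The fallback logic can be sketched as a small pure function. recoverCursor is hypothetical: it assumes cursors are timestamp strings, that an empty string means "start from the beginning", and that the caller supplies the value that getLastBotMessageTimestamp() would return. The actual function in src/index.ts may differ.

```typescript
// Sketch only: recoverCursor is a hypothetical illustration of the fallback
// order; the empty-string "no history" convention is an assumption.
function recoverCursor(
  savedCursors: Record<string, string>,
  chatJid: string,
  lastBotMessageTimestamp: string | null, // stand-in for getLastBotMessageTimestamp(chatJid)
): string {
  const saved = savedCursors[chatJid];
  // Normal case: the persisted per-group cursor is present and non-empty.
  if (saved) return saved;
  // Recovery: the last bot reply marks a point the agent had certainly processed.
  if (lastBotMessageTimestamp !== null) return lastBotMessageTimestamp;
  // New group with no history at all.
  return "";
}
```

Falling back to the last bot message means a lost cursor at worst causes reprocessing of messages received after that reply, rather than replaying the group's entire history.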

Design Pattern Summary

Pattern	Where	Why
Self-registering plugins	Channel registry	Zero-config channel addition
State machine + semaphore	GroupQueue	Per-group serialization with global concurrency
Push-to-pull adapter	MessageStream	Bridge IPC push events to SDK pull interface
File system as message bus	IPC layer	No networking, debuggable, container-runtime agnostic
Sentinel-delimited protocol	stdout parsing	Structured data in unstructured stream
External allowlist	Mount security	Tamper-proof from agent code
Gateway credential injection	OneCLI proxy	Secrets without secret access
Lifecycle hooks	Pre-compact archival	Preserve data before SDK discards it
Branch-as-feature	Skill system	Clean codebases, no runtime plugin complexity
Dual cursor with DB recovery	Message processing	Crash resilience via at-least-once polling with idempotent reprocessing