Design Patterns & Decisions

Patterns

1. Two-Tier Adapter (Channel Dock / Channel Plugin)

Where: src/channels/dock.ts, src/channels/plugins/types.plugin.ts

Pattern: Every channel has two representations:

Dock: Lightweight metadata (capabilities, limits, config shape). Never imports heavy dependencies. Used for routing, config validation, and UI.
Plugin: Full implementation (connect, send, receive, setup flows). Can import anything.

Why: The gateway needs to reason about all channels (for routing, config UI) without loading every SDK. Docks are importable anywhere without side effects. Plugins are loaded lazily when a channel is activated.

Clever detail: buildDockFromPlugin() auto-generates a dock from a plugin's metadata, so extension channels only define the plugin and get the dock for free.
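
As a rough illustration of that derivation, the sketch below copies only plain metadata out of a plugin. Only the name buildDockFromPlugin comes from the text above; the type shapes (`ChannelMetaSketch`, `ChannelPluginSketch`), the field names, and the function body are assumptions, not the code in src/channels.

```typescript
// Hypothetical shapes; the real types in src/channels may differ.
interface ChannelMetaSketch {
  id: string;
  capabilities: string[];
  limits: { maxMessageChars: number };
}

// A dock is metadata only: no connect/send methods, no SDK imports.
type ChannelDockSketch = ChannelMetaSketch;

interface ChannelPluginSketch {
  meta: ChannelMetaSketch;
  connect(): Promise<void>;
  send(text: string): Promise<void>;
}

// Copy plain data so the dock holds no reference to the heavy plugin object.
function buildDockFromPluginSketch(plugin: ChannelPluginSketch): ChannelDockSketch {
  return {
    id: plugin.meta.id,
    capabilities: [...plugin.meta.capabilities],
    limits: { ...plugin.meta.limits },
  };
}

const demoPlugin: ChannelPluginSketch = {
  meta: { id: "demo", capabilities: ["threads"], limits: { maxMessageChars: 4096 } },
  connect: async () => {},
  send: async () => {},
};

const demoDock = buildDockFromPluginSketch(demoPlugin);
```

Because the dock is a copy, routing and config code can hold it without keeping the plugin (and whatever it imported) alive.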

2. Factory-Based Tool Registration

Where: src/plugins/registry.ts, src/agents/pi-tools.ts

Pattern: Tools are registered as factories (functions that return tool instances), not as static objects.

api.registerTool(
  (ctx) => ctx.config?.channels?.discord?.enabled
    ? new DiscordActionTool(ctx.config)
    : null,  // Skip if Discord not configured
  { name: "discord_actions" }
);

Why: Tools need runtime context (workspace path, session key, config, channel capabilities). Factories defer instantiation until the context is available. Optional tools can return null to gracefully exclude themselves.
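
The resolution side of this pattern might look like the sketch below: factories are stored at registration time and only invoked once a context exists, with null results dropped. All names here (`ToolContextSketch`, `registerToolSketch`, `resolveToolsSketch`) are hypothetical stand-ins, not the registry's real API.

```typescript
// Hypothetical context and tool shapes, for illustration only.
interface ToolContextSketch {
  workspacePath: string;
  config?: { channels?: { discord?: { enabled?: boolean } } };
}

interface ToolSketch {
  name: string;
  run(input: string): string;
}

type ToolFactorySketch = (ctx: ToolContextSketch) => ToolSketch | null;

const toolFactories: ToolFactorySketch[] = [];

function registerToolSketch(factory: ToolFactorySketch): void {
  toolFactories.push(factory);
}

// Instantiate every factory against the runtime context, dropping nulls.
function resolveToolsSketch(ctx: ToolContextSketch): ToolSketch[] {
  const tools: ToolSketch[] = [];
  for (const factory of toolFactories) {
    const tool = factory(ctx);
    if (tool !== null) tools.push(tool);
  }
  return tools;
}

registerToolSketch((ctx) => ({
  name: "read_file",
  run: (path) => `${ctx.workspacePath}/${path}`,
}));
registerToolSketch((ctx) =>
  ctx.config?.channels?.discord?.enabled
    ? { name: "discord_actions", run: (input) => `discord:${input}` }
    : null,
);

const toolsWithoutDiscord = resolveToolsSketch({ workspacePath: "/tmp/ws" });
const toolsWithDiscord = resolveToolsSketch({
  workspacePath: "/tmp/ws",
  config: { channels: { discord: { enabled: true } } },
});
```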

3. Event Subscription (Observer)

Where: src/agents/pi-embedded-subscribe.ts, src/agents/pi-embedded-subscribe.handlers.ts

Pattern: The Pi SDK emits events (message_start, tool_execution_start, etc.) and OpenClaw subscribes with a unified event handler that dispatches to specialized handlers.

session.subscribe(createEmbeddedPiSessionEventHandler(ctx));

Why: The agentic loop runs inside the SDK. OpenClaw can't (and shouldn't) modify it. Events decouple the loop from the UI/channel output layer. Different channels can react to the same events differently (e.g., streaming token-by-token for web, batching for Telegram).
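
A unified handler that dispatches to specialized handlers could be sketched as follows. The two event names come from the text above; the payload fields (`messageId`, `toolName`) and every type or function name are assumptions, since the SDK's event shapes are not shown here.

```typescript
// Hypothetical event payloads; only the event names appear in this document.
type PiEventSketch =
  | { type: "message_start"; messageId: string }
  | { type: "tool_execution_start"; toolName: string };

// One optional, precisely typed handler per event type.
type HandlerMapSketch = {
  [K in PiEventSketch["type"]]?: (event: Extract<PiEventSketch, { type: K }>) => void;
};

// Build the single callback handed to the subscription; it routes by event type.
function createEventHandlerSketch(handlers: HandlerMapSketch): (event: PiEventSketch) => void {
  return (event) => {
    switch (event.type) {
      case "message_start":
        handlers.message_start?.(event);
        break;
      case "tool_execution_start":
        handlers.tool_execution_start?.(event);
        break;
    }
  };
}

const seenEvents: string[] = [];
const handleEvent = createEventHandlerSketch({
  message_start: (e) => seenEvents.push(`msg:${e.messageId}`),
  tool_execution_start: (e) => seenEvents.push(`tool:${e.toolName}`),
});

handleEvent({ type: "message_start", messageId: "m1" });
handleEvent({ type: "tool_execution_start", toolName: "read_file" });
```

A channel that batches output would pass a different handler map to the same factory, leaving the loop itself untouched.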

4. Two-Level Hook System

Where: src/hooks/internal-hooks.ts (internal), src/plugins/hooks.ts (plugin)

Pattern: Two parallel hook systems:

	Internal Hooks	Plugin Hooks
Registration	`registerInternalHook(key, handler)`	`api.on(hookName, handler, { priority })`
Dispatch	`triggerInternalHook(event)`	`hookRunner.runModifyingHook(name, event, ctx, merger)`
Execution	Parallel, fire-and-forget	Sequential by priority, result merging
Use case	Core lifecycle events	Extensible behavior modification

Why: Core code needs simple, fast event dispatch (command lifecycle, etc.). Plugins need composable, priority-ordered hooks that can modify behavior (e.g., several plugins each prepending to the system prompt, merged in order).

Modifying hooks are notable: each handler returns partial data, and a merge function accumulates results. This lets multiple plugins contribute to the system prompt without conflicts:

hookRunner.runModifyingHook(
  "beforePromptBuild",
  event,
  ctx,
  mergeBeforePromptBuild  // Accumulates systemPrompt + prependContext
);
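
To make the accumulate-and-merge behavior concrete, here is a simplified, synchronous sketch. The patch shape, the merge rule (last systemPrompt wins, prependContext entries concatenate), and the highest-priority-first ordering are all assumptions for illustration; the real runner and mergeBeforePromptBuild may behave differently.

```typescript
// Hypothetical partial result returned by each handler.
interface PromptPatchSketch {
  systemPrompt?: string;
  prependContext?: string[];
}

type ModifyingHandlerSketch = (event: { userText: string }) => PromptPatchSketch | undefined;

const registeredHandlers: { priority: number; handler: ModifyingHandlerSketch }[] = [];

function onSketch(handler: ModifyingHandlerSketch, priority = 0): void {
  registeredHandlers.push({ priority, handler });
}

// Assumed merge rule: later systemPrompt overrides, prependContext accumulates.
function mergePatchesSketch(acc: PromptPatchSketch, next: PromptPatchSketch): PromptPatchSketch {
  return {
    systemPrompt: next.systemPrompt ?? acc.systemPrompt,
    prependContext: [...(acc.prependContext ?? []), ...(next.prependContext ?? [])],
  };
}

// Run handlers one at a time, highest priority first, folding results together.
function runModifyingHookSketch(event: { userText: string }): PromptPatchSketch {
  const ordered = [...registeredHandlers].sort((a, b) => b.priority - a.priority);
  let result: PromptPatchSketch = {};
  for (const { handler } of ordered) {
    const patch = handler(event);
    if (patch) result = mergePatchesSketch(result, patch);
  }
  return result;
}

onSketch(() => ({ prependContext: ["low-priority note"] }), 1);
onSketch(() => ({ prependContext: ["high-priority note"], systemPrompt: "Be brief." }), 10);
onSketch(() => undefined, 5); // a handler may decline to contribute

const mergedPatch = runModifyingHookSketch({ userText: "hi" });
```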

5. Registry + `requireActive()` Guard

Where: src/plugins/registry.ts

Pattern: The plugin registry is set globally via setActivePluginRegistry() and accessed via requireActivePluginRegistry(), which throws if called before initialization.

Why: Prevents code from accidentally accessing plugins before they're loaded. Makes the initialization order explicit and testable (tests can set/unset the registry). The registry is still global state, but the "require" accessor makes the dependency visible at each call site and fails loudly when it is missing.
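
The guard itself is small. A minimal sketch, using suffixed names and an invented `PluginRegistrySketch` shape to make clear this is not the code in src/plugins/registry.ts:

```typescript
// Hypothetical registry shape.
interface PluginRegistrySketch {
  pluginIds: string[];
}

let activeRegistrySketch: PluginRegistrySketch | null = null;

// Tests can pass null to reset state between cases.
function setActivePluginRegistrySketch(registry: PluginRegistrySketch | null): void {
  activeRegistrySketch = registry;
}

// Throws instead of returning undefined, so ordering bugs surface immediately.
function requireActivePluginRegistrySketch(): PluginRegistrySketch {
  if (activeRegistrySketch === null) {
    throw new Error("Plugin registry accessed before initialization");
  }
  return activeRegistrySketch;
}

let earlyAccessError = "";
try {
  requireActivePluginRegistrySketch();
} catch (err) {
  earlyAccessError = (err as Error).message;
}

setActivePluginRegistrySketch({ pluginIds: ["discord"] });
const activePluginIds = requireActivePluginRegistrySketch().pluginIds;
```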

6. Multi-Tier Routing with Fallback

Where: src/routing/resolve-route.ts

Pattern: Route resolution walks a priority chain:

1. Direct peer → 2. Thread parent → 3. Guild+roles → 4. Guild → 5. Team → 6. Account → 7. Channel → 8. Default

Why: Different channels have different granularity. Discord has guilds, roles, and threads. WhatsApp has groups and direct messages. A single routing strategy can't handle all of them. The fallback chain lets admins configure bindings at any level: resolution uses the most specific binding that matches and falls back to broader ones, ending at the default.
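
The walk down the chain can be sketched as a loop over tiers ordered from most to least specific. The tier identifiers, the bindings layout (tier, then lookup key, then agent id), and the function name are assumptions made for this example.

```typescript
// Tiers ordered from most to least specific, mirroring the chain above.
const ROUTE_TIERS = [
  "peer", "threadParent", "guildRoles", "guild", "team", "account", "channel",
] as const;
type RouteTierSketch = (typeof ROUTE_TIERS)[number];

// Hypothetical layout: each tier maps a lookup key to an agent id.
type BindingsSketch = Partial<Record<RouteTierSketch, Record<string, string>>>;
// The keys an incoming message provides; tiers that don't apply are omitted.
type MessageKeysSketch = Partial<Record<RouteTierSketch, string>>;

function resolveRouteSketch(
  bindings: BindingsSketch,
  keys: MessageKeysSketch,
  defaultAgent: string,
): { agentId: string; matchedBy: RouteTierSketch | "default" } {
  for (const tier of ROUTE_TIERS) {
    const key = keys[tier];
    const agentId = key === undefined ? undefined : bindings[tier]?.[key];
    if (agentId !== undefined) return { agentId, matchedBy: tier };
  }
  // Nothing matched at any tier: fall through to the default route.
  return { agentId: defaultAgent, matchedBy: "default" };
}

const demoBindings: BindingsSketch = {
  peer: { u42: "vip-agent" },
  guild: { g1: "support-agent" },
};

const guildRoute = resolveRouteSketch(demoBindings, { peer: "u1", guild: "g1" }, "main");
const peerRoute = resolveRouteSketch(demoBindings, { peer: "u42", guild: "g1" }, "main");
const defaultRoute = resolveRouteSketch(demoBindings, { channel: "irc" }, "main");
```

A WhatsApp message simply omits the guild and thread keys, so the same loop serves channels with very different granularity.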

7. Session Write Locking

Where: src/agents/session-write-lock.ts

Pattern: Before modifying session state, acquireSessionWriteLock() must be called. The lock is file-based (proper-lockfile) with a configurable maximum hold time.

Why: Multiple channels might trigger the same agent session simultaneously (e.g., user sends two WhatsApp messages quickly). The lock prevents concurrent writes from corrupting session state. The max hold time prevents deadlocks from crashed processes.
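
The interaction between exclusivity and the max hold time can be illustrated with an in-memory stand-in. This is deliberately not file-based and does not use proper-lockfile; the function name, the explicit `now` parameter, and the stale-takeover rule are assumptions chosen to keep the example deterministic.

```typescript
// In-memory stand-in for a file lock; every name here is hypothetical.
interface HeldLockSketch {
  owner: string;
  acquiredAt: number;
}

const heldLocks = new Map<string, HeldLockSketch>();

// Returns a release function on success, or null if a fresh lock is held by someone else.
function tryAcquireSessionLockSketch(
  sessionKey: string,
  owner: string,
  now: number,
  maxHoldMs: number,
): (() => void) | null {
  const current = heldLocks.get(sessionKey);
  const isStale = current !== undefined && now - current.acquiredAt > maxHoldMs;
  if (current !== undefined && !isStale) return null;
  // Either free or stale (holder presumed crashed): take the lock.
  heldLocks.set(sessionKey, { owner, acquiredAt: now });
  return () => {
    // Only the current owner may release, so a stale holder cannot drop a newer lock.
    if (heldLocks.get(sessionKey)?.owner === owner) heldLocks.delete(sessionKey);
  };
}

const releaseA = tryAcquireSessionLockSketch("session-1", "A", 0, 5000);
const blockedB = tryAcquireSessionLockSketch("session-1", "B", 1000, 5000);
const takeoverC = tryAcquireSessionLockSketch("session-1", "C", 6000, 5000);
releaseA?.(); // A's late release is ignored because C now owns the lock
const stillHeldByC = heldLocks.get("session-1")?.owner === "C";
```

Here B is refused while A's lock is fresh, and C takes over only after A has exceeded the hold limit, which is the deadlock-avoidance behavior described above.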

8. Modular Prompt Construction

Where: src/agents/system-prompt.ts

Pattern: The system prompt is built from independent sections:

const sections = [
  buildIdentityLine(...),
  buildSkillsSection(...),
  buildMemorySection(...),
  buildUserIdentitySection(...),
  buildTimeSection(...),
  buildReplyTagsSection(...),
  buildMessagingSection(...),
  buildVoiceSection(...),
  buildWorkspaceSection(...),
  buildRuntimeSection(...),
];

Each section function returns string[] (or an empty array to exclude itself). Sections decide their own inclusion based on mode (full / minimal / none), available tools, and config.

Why: Different contexts need different prompts. Subagents get minimal prompts. CLI sessions skip messaging sections. The section pattern makes additions/removals safe — each section is self-contained.
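
A reduced sketch of self-excluding sections follows. The three mode names come from the text above; the parameter shape, the two example builders, their inclusion rules, and the joining strategy are assumptions.

```typescript
type PromptModeSketch = "full" | "minimal" | "none";

// Hypothetical inputs a section may inspect.
interface PromptParamsSketch {
  mode: PromptModeSketch;
  agentName: string;
  hasMessagingTool: boolean;
}

type SectionBuilderSketch = (p: PromptParamsSketch) => string[];

// Each builder decides for itself whether it applies; [] means "leave me out".
const buildIdentityLineSketch: SectionBuilderSketch = (p) =>
  p.mode === "none" ? [] : [`You are ${p.agentName}.`];

const buildMessagingSectionSketch: SectionBuilderSketch = (p) =>
  p.mode === "full" && p.hasMessagingTool
    ? ["## Messaging", "Reply through the message tool."]
    : [];

function buildSystemPromptSketch(p: PromptParamsSketch): string {
  const sections = [buildIdentityLineSketch(p), buildMessagingSectionSketch(p)];
  return sections
    .filter((lines) => lines.length > 0) // drop sections that excluded themselves
    .map((lines) => lines.join("\n"))
    .join("\n\n");
}

const fullPrompt = buildSystemPromptSketch({ mode: "full", agentName: "Pi", hasMessagingTool: true });
const minimalPrompt = buildSystemPromptSketch({ mode: "minimal", agentName: "Pi", hasMessagingTool: true });
const emptyPrompt = buildSystemPromptSketch({ mode: "none", agentName: "Pi", hasMessagingTool: true });
```

Adding a section means adding one builder to the array; no other builder needs to know about it.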

9. Streaming with Coalescing

Where: Channel dock configurations in src/channels/dock.ts

Pattern: For channels with rate limits (Telegram, IRC), streaming output is coalesced:

streaming: {
  blockStreamingCoalesceDefaults: { minChars: 1500, idleMs: 1000 }
}

Tokens accumulate until either minChars characters are buffered or idleMs milliseconds pass without new tokens, then the batch is sent as one message.

Why: Token-by-token streaming works for web UIs but would flood messaging platforms with hundreds of tiny messages. Coalescing balances responsiveness with API rate limits.
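
The two flush triggers can be sketched with a small buffer class. To keep the example deterministic it takes the current time as an argument instead of using timers; the class name, method names, and that clock-injection design are assumptions, and only minChars and idleMs come from the config above.

```typescript
interface CoalesceOptionsSketch {
  minChars: number;
  idleMs: number;
}

class StreamCoalescerSketch {
  private buffer = "";
  private lastTokenAt = 0;
  private readonly opts: CoalesceOptionsSketch;
  private readonly sendFn: (text: string) => void;

  constructor(opts: CoalesceOptionsSketch, sendFn: (text: string) => void) {
    this.opts = opts;
    this.sendFn = sendFn;
  }

  // Append a token; flush immediately once the size threshold is reached.
  push(token: string, now: number): void {
    this.buffer += token;
    this.lastTokenAt = now;
    if (this.buffer.length >= this.opts.minChars) this.flush();
  }

  // Called periodically by the owner; flushes if the stream has gone quiet.
  tick(now: number): void {
    if (this.buffer.length > 0 && now - this.lastTokenAt >= this.opts.idleMs) this.flush();
  }

  // Send whatever is buffered as one message (also useful at end of stream).
  flush(): void {
    if (this.buffer.length === 0) return;
    this.sendFn(this.buffer);
    this.buffer = "";
  }
}

const sentMessages: string[] = [];
const coalescer = new StreamCoalescerSketch({ minChars: 10, idleMs: 1000 }, (text) => {
  sentMessages.push(text);
});

coalescer.push("Hello ", 0);   // 6 chars buffered, below minChars
coalescer.push("world", 100);  // 11 chars, size threshold reached: flush
coalescer.push("!", 200);      // 1 char buffered
coalescer.tick(700);           // only 500 ms idle: keep waiting
coalescer.tick(1200);          // 1000 ms idle: flush
```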

10. Config Hot-Reload

Where: src/gateway/config-reload.ts, src/gateway/server-reload-handlers.ts

Pattern: File watcher detects config changes → triggers selective reload:

Plugin registry refresh
Channel manager restart
Hook runner reinitialization
Model catalog refresh

In-flight requests are not interrupted. New requests use the new config.

Why: The daemon runs continuously. Requiring a restart for config changes is unacceptable for a multi-channel assistant that might be mid-conversation.
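
One way to keep the reload selective is to compare the old and new config section by section and trigger only the affected steps. The sketch below is a guess at that idea, not the logic in src/gateway/config-reload.ts: the mapping from config sections to reload steps, the section names, and the JSON-based comparison are all assumptions.

```typescript
type ConfigSketch = Record<string, unknown>;
type ReloadActionSketch = "plugin-registry" | "channel-manager" | "hook-runner" | "model-catalog";

// Assumed mapping from top-level config sections to reload steps.
const RELOAD_RULES: { section: string; action: ReloadActionSketch }[] = [
  { section: "plugins", action: "plugin-registry" },
  { section: "channels", action: "channel-manager" },
  { section: "hooks", action: "hook-runner" },
  { section: "models", action: "model-catalog" },
];

// Structural comparison via JSON; adequate for plain config data,
// though it treats a reordering of object keys as a change.
function sectionChangedSketch(prev: ConfigSketch, next: ConfigSketch, section: string): boolean {
  return JSON.stringify(prev[section]) !== JSON.stringify(next[section]);
}

// Return only the reload steps whose config section actually changed.
function planReloadSketch(prev: ConfigSketch, next: ConfigSketch): ReloadActionSketch[] {
  return RELOAD_RULES
    .filter((rule) => sectionChangedSketch(prev, next, rule.section))
    .map((rule) => rule.action);
}

const previousConfig: ConfigSketch = {
  plugins: { enabled: ["a"] },
  channels: { telegram: { enabled: true } },
  models: { primary: "model-a" },
};
const updatedConfig: ConfigSketch = {
  plugins: { enabled: ["a"] },
  channels: { telegram: { enabled: false } },
  hooks: { logging: true },
  models: { primary: "model-a" },
};

const reloadPlan = planReloadSketch(previousConfig, updatedConfig);
```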

Notable Design Tradeoffs

SDK Encapsulation vs. Control

The agentic loop lives inside the Pi agent SDK. OpenClaw can't directly control iteration, tool dispatch ordering, or retry logic within the loop. The tradeoff:

Pro: Clean separation. The SDK handles the hard parts (streaming, tool dispatch, context management, compaction). OpenClaw focuses on everything around it.
Con: Debugging loop behavior requires understanding the SDK. Custom loop modifications (e.g., "stop after 5 tool calls") must be implemented through the SDK's extension points (event handlers, abort signals) rather than direct control.

Lightweight Docks vs. Plugin Duplication

Channel metadata exists in two places (dock and plugin). This is intentional duplication:

Pro: Fast import paths. The gateway can load all docks instantly without importing WhatsApp/Discord SDKs.
Con: Metadata can drift between dock and plugin if not kept in sync. Mitigated by buildDockFromPlugin().

File-Based Sessions vs. Database

Sessions are persisted as JSON files with file-based locking:

Pro: Zero infrastructure. No database to manage. Works on any filesystem.
Con: No querying across sessions. File locking is platform-dependent. Scaling to thousands of concurrent sessions is constrained by filesystem I/O.

Dynamic Plugin Loading vs. Static Bundling

Plugins are discovered and dynamically imported at runtime:

Pro: Extensibility without rebuilding. Users can drop a plugin folder and restart.
Con: Runtime errors from bad plugins. Type safety is limited to the plugin SDK interface. Import errors are caught and reported as diagnostics but don't prevent startup.
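
The "report but don't block startup" behavior can be sketched as a loop that isolates each plugin's failure. Real loading would use asynchronous dynamic imports; this example substitutes synchronous loader functions so it stays self-contained, and every name in it is hypothetical.

```typescript
interface LoadedPluginSketch {
  id: string;
}

interface PluginDiagnosticSketch {
  id: string;
  message: string;
}

// Stand-in for a dynamic import: a function that returns the plugin or throws.
type PluginLoaderSketch = () => LoadedPluginSketch;

function loadPluginsSketch(loaders: Record<string, PluginLoaderSketch>): {
  plugins: LoadedPluginSketch[];
  diagnostics: PluginDiagnosticSketch[];
} {
  const plugins: LoadedPluginSketch[] = [];
  const diagnostics: PluginDiagnosticSketch[] = [];
  for (const id of Object.keys(loaders)) {
    try {
      plugins.push(loaders[id]());
    } catch (err) {
      // Record the failure and keep going so one bad plugin cannot block startup.
      diagnostics.push({ id, message: err instanceof Error ? err.message : String(err) });
    }
  }
  return { plugins, diagnostics };
}

const loadResult = loadPluginsSketch({
  good: () => ({ id: "good" }),
  bad: () => {
    throw new Error("missing dependency");
  },
});
```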

What's Clever

Tool Definitions as LLM Prompts

Every tool's description field is carefully written as a prompt to the LLM. These descriptions are not documentation for developers — they're instructions for the AI on when and how to use the tool. This is a form of "prompt engineering at the tool level."

Session Transcript Repair

src/agents/session-transcript-repair.ts and src/agents/session-file-repair.ts handle corrupted session files — detecting and fixing malformed JSON, orphaned tool_use blocks without matching tool_result entries, and other edge cases. This robustness matters because sessions persist across process crashes.
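
For the orphaned tool_use case, one plausible repair is to insert a synthetic tool_result directly after any tool_use that never received one. Whether the actual repair code synthesizes results or drops the orphan is not stated here, so treat the strategy, the entry shapes, and the placeholder text as assumptions.

```typescript
// Hypothetical, simplified transcript entries.
type TranscriptEntrySketch =
  | { kind: "text"; role: "user" | "assistant"; text: string }
  | { kind: "tool_use"; id: string; name: string }
  | { kind: "tool_result"; toolUseId: string; output: string };

function repairOrphanedToolUseSketch(entries: TranscriptEntrySketch[]): TranscriptEntrySketch[] {
  // First pass: collect the ids of tool calls that did get a result.
  const answered = new Set<string>();
  for (const entry of entries) {
    if (entry.kind === "tool_result") answered.add(entry.toolUseId);
  }
  // Second pass: copy entries, patching in a placeholder result after each orphan.
  const repaired: TranscriptEntrySketch[] = [];
  for (const entry of entries) {
    repaired.push(entry);
    if (entry.kind === "tool_use" && !answered.has(entry.id)) {
      repaired.push({
        kind: "tool_result",
        toolUseId: entry.id,
        output: "[no result recorded: session was interrupted]",
      });
    }
  }
  return repaired;
}

const crashedTranscript: TranscriptEntrySketch[] = [
  { kind: "text", role: "user", text: "list files" },
  { kind: "tool_use", id: "t1", name: "ls" },
  { kind: "tool_result", toolUseId: "t1", output: "a.txt" },
  { kind: "tool_use", id: "t2", name: "cat" }, // process died before the result was written
];

const repairedTranscript = repairOrphanedToolUseSketch(crashedTranscript);
```

Running the repair on an already repaired transcript changes nothing, which makes it safe to apply on every load.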

Steer During Execution

session.steer(text) lets users inject messages during an active agentic loop. The SDK includes the injected message in the next LLM call, allowing mid-execution course correction. This is particularly useful for long-running agent tasks.

Model Failover Chain

src/agents/model-fallback.ts implements automatic failover between models. If the primary model fails (rate limit, auth error, billing), the system tries the next model in the chain. This is transparent to the user and the agent's tool state is preserved.
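
The failover loop might be sketched as below. Real model calls would be asynchronous; this version is synchronous and uses a fake caller to stay self-contained. The result type, the set of failure kinds that trigger failover, and the decision to stop on other errors are assumptions drawn from the three examples in the paragraph above.

```typescript
type FailureKindSketch = "rate_limit" | "auth" | "billing" | "other";

type CallResultSketch =
  | { status: "ok"; text: string }
  | { status: "error"; kind: FailureKindSketch };

// Stand-in for the real (asynchronous) model call.
type ModelCallerSketch = (model: string, prompt: string) => CallResultSketch;

// Assumed: only these failure kinds move on to the next model.
const FAILOVER_KINDS: FailureKindSketch[] = ["rate_limit", "auth", "billing"];

function callWithFallbackSketch(
  chain: string[],
  prompt: string,
  call: ModelCallerSketch,
): { model: string; text: string } {
  const attempts: string[] = [];
  for (const model of chain) {
    const result = call(model, prompt);
    if (result.status === "ok") return { model, text: result.text };
    attempts.push(`${model}:${result.kind}`);
    // Unrecognized failures are not retried on another model.
    if (FAILOVER_KINDS.indexOf(result.kind) === -1) break;
  }
  throw new Error(`All models failed (${attempts.join(", ")})`);
}

const calledModels: string[] = [];
const fakeCall: ModelCallerSketch = (model) => {
  calledModels.push(model);
  return model === "primary"
    ? { status: "error", kind: "rate_limit" }
    : { status: "ok", text: `answer from ${model}` };
};

const failoverResult = callWithFallbackSketch(["primary", "secondary", "tertiary"], "hello", fakeCall);
```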

Bootstrap File Convention

The agent looks for OPENCLAW.md (or similar bootstrap files) in the workspace root. This file is included in the system prompt, letting projects customize agent behavior per-repository without any configuration. It's a convention-over-configuration pattern borrowed from tools like .editorconfig.