CodeDocs Vault

Context Management

How conversation history is stored, bounded, compacted, and repaired.

Overview

Every agent conversation has a context window — the total amount of text (measured in tokens) that the LLM can see at once. Context management is the system that ensures conversations stay within this window, persists history to disk, handles corruption, and compacts old history when space runs out.

                   Context Window (e.g., 200k tokens)
┌──────────────────────────────────────────────────────────┐
│ System Prompt     │ Bootstrap  │ Conversation History    │
│ (~5-15k tokens)   │ Files      │ (user + assistant +     │
│                   │ (~5-20k)   │  tool calls/results)    │
│                   │            │                         │
│ [reserved floor]  │ [capped]   │ [managed by compaction] │
└──────────────────────────────────────────────────────────┘

Session Persistence

Storage Format

Sessions are stored as JSONL files (one JSON object per line) at:

~/.openclaw/agents/<agentId>/sessions/<sessionId>.jsonl

Each line is a message: { role, content, ... } — user messages, assistant responses, tool calls, and tool results.
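
Reading that format back is a few lines of TypeScript. A minimal sketch (the helper name is illustrative, not the store's actual API); unparseable lines are skipped in memory, mirroring the on-disk repair behavior covered under Transcript Repair:

```typescript
type SessionMessage = { role: string; content: unknown; [key: string]: unknown };

// Illustrative helper (not the real store API): parse the text of a
// session .jsonl file into messages, skipping lines that fail to parse.
function parseSessionLines(text: string): SessionMessage[] {
  const messages: SessionMessage[] = [];
  for (const line of text.split("\n")) {
    if (!line.trim()) continue; // ignore blank lines
    try {
      messages.push(JSON.parse(line) as SessionMessage);
    } catch {
      // malformed line: skipped here; persistent repair is a separate pass
    }
  }
  return messages;
}
```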

Session Store (src/config/sessions/store.ts)

A central index at ~/.openclaw/sessions.json tracks all sessions:

type SessionEntry = {
  sessionId: string;
  sessionFile: string;           // Path to .jsonl file
  updatedAt: number;             // Last modification (ms)
  contextTokens: number;         // Model's context window size
  totalTokens: number;           // Latest measured token count
  totalTokensFresh: boolean;     // True = from latest run
  compactionCount: number;       // Times compacted
  // ... delivery metadata (channel, peer, thread)
};

Caching: 45-second TTL with mtime-based invalidation.
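
As a sketch of that caching rule (names here are assumptions, not the real store.ts internals): a cached parse of sessions.json is reused only while it is younger than the TTL and the file's mtime has not changed since it was read.

```typescript
const SESSION_STORE_TTL_MS = 45_000; // 45-second TTL from this page

type CachedStore<T> = { value: T; readAt: number; mtimeMs: number };

// Reuse the cached parse only if it is within TTL and the file on disk
// has not been modified since the cached read.
function isCacheFresh<T>(
  cache: CachedStore<T> | undefined,
  now: number,
  currentMtimeMs: number,
): boolean {
  if (!cache) return false;
  if (now - cache.readAt > SESSION_STORE_TTL_MS) return false; // TTL expired
  return cache.mtimeMs === currentMtimeMs; // mtime-based invalidation
}
```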

Maintenance (configurable mode: warn or enforce): sessions older than 30 days are pruned, the index is capped at 500 entries, and the store file is rotated when it exceeds 10MB.

Session Keys (src/config/sessions/session-key.ts)

Session keys encode the full routing context:

agent:{agentId}:{provider}:{kind}:{userId}
Component   Purpose
agentId     Which agent owns this session
provider    Channel (discord, whatsapp, telegram, etc.)
kind        dm/direct, channel/group
userId      Peer identifier

Thread suffixes (:thread:123, :topic:456) are stripped for DM history limit lookups, ensuring consistent limits across threads.
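
A minimal sketch of the key scheme and suffix stripping (helper names are assumptions, not the session-key.ts API):

```typescript
// Build a key in the agent:{agentId}:{provider}:{kind}:{userId} shape.
function buildSessionKey(
  agentId: string,
  provider: string,
  kind: string,
  userId: string,
): string {
  return `agent:${agentId}:${provider}:${kind}:${userId}`;
}

// Strip a trailing :thread:<id> or :topic:<id> suffix so DM history
// limits resolve identically across threads of the same peer.
function stripThreadSuffix(sessionKey: string): string {
  return sessionKey.replace(/:(thread|topic):[^:]+$/, "");
}
```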

Session Locking (src/agents/session-write-lock.ts)

File-based exclusive locks prevent concurrent writes:

session-file.jsonl.lock → { pid: 12345, createdAt: 1708000000000 }
Feature               Detail
Acquisition timeout   10 seconds (default)
Max hold time         5 minutes (default)
Watchdog interval     60 seconds — releases locks held beyond max
Nested support        Counter-based — inner acquires increment, release on final
Stale detection       Checks if PID is alive, cleans orphaned locks
Signal handlers       SIGINT, SIGTERM, SIGQUIT, SIGABRT release locks
Windows atomicity     Temp-file + rename with 5-retry loop (50ms backoff)
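
The acquisition flow can be sketched roughly as follows, assuming a Node runtime; the helper names, the retry-on-steal recursion, and the exact timeouts are illustrative, not the session-write-lock.ts implementation:

```typescript
import * as fs from "node:fs";

type LockInfo = { pid: number; createdAt: number };

// Signal 0 sends nothing; it only checks whether the PID exists.
function isPidAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch {
    return false;
  }
}

function tryAcquireLock(lockPath: string, maxHoldMs = 5 * 60_000): boolean {
  try {
    // "wx" fails if the file already exists: exclusive create
    fs.writeFileSync(
      lockPath,
      JSON.stringify({ pid: process.pid, createdAt: Date.now() }),
      { flag: "wx" },
    );
    return true;
  } catch {
    // Lock exists: steal it only if the holder died or held it too long
    try {
      const info = JSON.parse(fs.readFileSync(lockPath, "utf8")) as LockInfo;
      const stale = !isPidAlive(info.pid) || Date.now() - info.createdAt > maxHoldMs;
      if (stale) {
        fs.unlinkSync(lockPath); // clean orphaned lock, then retry once
        return tryAcquireLock(lockPath, maxHoldMs);
      }
    } catch {
      // unreadable lock file: treat as contended
    }
    return false;
  }
}
```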

History Limiting (src/agents/pi-embedded-runner/history.ts)

Before the agent runs, conversation history is trimmed by user turn count:

limitHistoryTurns(messages, limit)
// Counts user turns backwards from the end
// Returns all messages from the last N user turns onwards
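
A minimal runnable sketch of that rule (types are simplified; the real helper in history.ts operates on full transcript messages):

```typescript
type Msg = { role: "user" | "assistant" | "toolResult"; content: string };

// Keep everything from the Nth-from-last user turn onwards.
function limitHistoryTurns(messages: Msg[], limit: number): Msg[] {
  if (limit <= 0) return messages;
  let userTurns = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === "user" && ++userTurns === limit) {
      return messages.slice(i);
    }
  }
  return messages; // fewer than `limit` user turns: keep all
}
```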

Limits are resolved per-session from config:

Config Key                                     Scope
channels.{provider}.dms.{userId}.historyLimit  Per-user override
channels.{provider}.dmHistoryLimit             DM default for a channel
channels.{provider}.historyLimit               Channel/group default

Source: getDmHistoryLimitFromSessionKey() in src/agents/pi-embedded-runner/history.ts

Context Window Guards (src/agents/context-window-guard.ts)

Resolution priority for context window size:

  1. modelsConfig in openclaw.json (per-provider override)
  2. model.contextWindow from model discovery
  3. agentContextTokens cap (if set)
  4. Default: 200,000 tokens
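
One plausible reading of that priority order, as a sketch; treating the agentContextTokens cap as an upper bound on the resolved value is this sketch's assumption:

```typescript
const DEFAULT_CONTEXT_WINDOW = 200_000;

function resolveContextWindow(opts: {
  modelsConfigOverride?: number;    // 1. modelsConfig in openclaw.json
  discoveredContextWindow?: number; // 2. model.contextWindow from discovery
  agentContextTokensCap?: number;   // 3. per-agent cap
}): number {
  const base =
    opts.modelsConfigOverride ??
    opts.discoveredContextWindow ??
    DEFAULT_CONTEXT_WINDOW;         // 4. fallback default
  const cap = opts.agentContextTokensCap;
  return cap !== undefined && cap < base ? cap : base;
}
```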

Safety bounds:

Token Budgeting

Reserve Floor (src/agents/pi-settings.ts)

DEFAULT_PI_COMPACTION_RESERVE_TOKENS_FLOOR = 20,000

This is the minimum number of tokens held back for the system prompt, tools, and bootstrap context. Compaction must always leave at least this much of the window free; history can never grow into the reserve.

Budget Allocation

Context Window (200k tokens)
├── Reserve Floor (20k) — system prompt, tools, bootstrap
├── History Budget (up to 50% of window = 100k)
│   └── Compaction triggers when exceeded
└── Working Space (remaining) — current turn + tool results
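
The arithmetic of that split, using the numbers above (the helper name is illustrative):

```typescript
const RESERVE_FLOOR = 20_000;   // system prompt, tools, bootstrap
const MAX_HISTORY_SHARE = 0.5;  // history gets up to half the window

// Derive the three budget regions from a total context window size.
function budgets(contextWindow: number) {
  const historyBudget = Math.floor(contextWindow * MAX_HISTORY_SHARE);
  const workingSpace = contextWindow - RESERVE_FLOOR - historyBudget;
  return { reserveFloor: RESERVE_FLOOR, historyBudget, workingSpace };
}
```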

Compaction

When conversation history approaches the context window limit, compaction summarizes older turns to free token budget.

How It Works (src/agents/compaction.ts, src/agents/pi-embedded-runner/compact.ts)

1. Acquire session write lock (5 min max hold)
2. Repair session file if needed (fix malformed JSONL)
3. Prewarm file (read 4KB for OS page cache)
4. Load session via SessionManager
5. Sanitize transcript (repair tool pairing)
6. Apply history turn limits
7. Run compaction:
   a. Estimate token usage per message
   b. Split messages into chunks (proportional to token weight)
   c. Summarize each chunk via LLM
   d. Merge partial summaries if multi-part
   e. Replace old history with summary
8. Fire before_compaction / after_compaction hooks
9. Log diagnostics (pre/post message counts, token deltas)
10. Release lock
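
Step 7b (splitting messages into chunks proportional to token weight) might look roughly like this greedy sketch; the real chunker in compact.ts may differ:

```typescript
// Greedy split: accumulate items until a chunk reaches its share of the
// estimated total, then start the next chunk. tokensOf stands in for the
// per-message token estimator.
function splitByTokenWeight<T>(
  items: T[],
  tokensOf: (item: T) => number,
  parts: number,
): T[][] {
  const total = items.reduce((sum, it) => sum + tokensOf(it), 0);
  const target = total / parts;
  const chunks: T[][] = [];
  let current: T[] = [];
  let acc = 0;
  for (const item of items) {
    current.push(item);
    acc += tokensOf(item);
    if (acc >= target && chunks.length < parts - 1) {
      chunks.push(current); // chunk reached its token share
      current = [];
      acc = 0;
    }
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```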

Safeguard Mode (src/agents/pi-extensions/compaction-safeguard.ts)

When agents.defaults.compaction.mode = "safeguard":

Max history tokens = contextWindow × maxHistoryShare × SAFETY_MARGIN
                   = 200,000     × 0.5              × 1.2
                   = 120,000 tokens

If history exceeds this budget, pruneHistoryForContextShare() iteratively removes the oldest message chunks until within budget, repairing tool pairing after each round.
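
A sketch of the safeguard budget and prune loop; the chunk shape and helper names are stand-ins for the real compaction-safeguard.ts code, which also repairs tool pairing after each pruning round:

```typescript
const SAFETY_MARGIN = 1.2; // 20% margin from this page

// Budget formula from above: window x share x margin.
function maxHistoryTokens(contextWindow: number, maxHistoryShare: number): number {
  return contextWindow * maxHistoryShare * SAFETY_MARGIN;
}

type Chunk = { tokens: number };

// Drop the oldest chunks until the estimated history total fits the budget.
function pruneToBudget(chunks: Chunk[], budget: number): Chunk[] {
  let kept = chunks.slice();
  let total = kept.reduce((sum, c) => sum + c.tokens, 0);
  while (kept.length > 1 && total > budget) {
    total -= kept[0].tokens;
    kept = kept.slice(1); // oldest first
  }
  return kept;
}
```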

Summarization Strategy

summarizeWithFallback()
├── Full summarization (all messages)
├── Partial (oversized messages compressed individually)
└── Generic fallback: "No prior history."

summarizeInStages() (for large contexts)
├── Split into N parts (min 4 messages per part)
├── Summarize each part independently
└── Merge partial summaries

Token estimation uses a 20% safety margin (SAFETY_MARGIN = 1.2) to account for inaccuracy.

Summary Enrichment

Compaction summaries include:

Timeout Protection

Parameter                  Value
Compaction timeout         5 minutes
Lock hold time             timeout + 2 min grace
LLM summarization retries  3 attempts, 500ms–5s backoff

Lane Queueing

Compaction uses command lanes to prevent conflicts:

Transcript Repair

Session files can become corrupted (process crashes mid-write, disk errors). Multiple repair mechanisms handle this.

File-Level Repair (src/agents/session-file-repair.ts)

1. Read entire file as text
2. Split by newlines, parse each as JSON
3. Drop unparseable lines
4. Validate first entry is session header
5. Write repaired JSONL (backup original as .bak-{pid}-{timestamp})
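
As a pure-function sketch of steps 1 through 5 (the header predicate and return shape are assumptions; the real code also writes the .bak backup before rewriting):

```typescript
// Repair raw .jsonl text: drop unparseable lines, require a valid session
// header as the first entry, and re-serialize the surviving entries.
function repairSessionText(
  raw: string,
  isSessionHeader: (entry: unknown) => boolean,
): { repaired: string; dropped: number } | null {
  const entries: unknown[] = [];
  let dropped = 0;
  for (const line of raw.split("\n")) {
    if (!line.trim()) continue;
    try {
      entries.push(JSON.parse(line));
    } catch {
      dropped++; // unparseable line: drop
    }
  }
  // No header means the file is unrepairable by this pass.
  if (entries.length === 0 || !isSessionHeader(entries[0])) return null;
  return {
    repaired: entries.map((e) => JSON.stringify(e)).join("\n") + "\n",
    dropped,
  };
}
```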

Tool Pairing Repair (src/agents/session-transcript-repair.ts)

The Anthropic API rejects transcripts where tool_use blocks don't have matching tool_result blocks. Repair handles:

Issue                                          Fix
Missing tool result                            Insert synthetic error: "[openclaw] missing tool result..."
Orphaned tool result (no matching call)        Drop it
Duplicate tool results (same ID)               Keep first, drop rest
Mis-ordered results (not adjacent to call)     Move adjacent to the call
Incomplete tool calls (missing id/name/input)  Drop
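
A partial sketch covering two of the rules above (synthetic-result insertion and orphan dropping), with simplified message shapes; duplicate and ordering handling are omitted:

```typescript
type TMsg =
  | { kind: "tool_use"; id: string }
  | { kind: "tool_result"; forId: string }
  | { kind: "text"; text: string };

function repairToolPairing(msgs: TMsg[]): TMsg[] {
  const callIds = new Set(
    msgs.filter((m) => m.kind === "tool_use").map((m) => (m as any).id as string),
  );
  const resultIds = new Set(
    msgs.filter((m) => m.kind === "tool_result").map((m) => (m as any).forId as string),
  );
  const out: TMsg[] = [];
  for (const m of msgs) {
    if (m.kind === "tool_result" && !callIds.has(m.forId)) continue; // orphan: drop
    out.push(m);
    if (m.kind === "tool_use" && !resultIds.has(m.id)) {
      // missing result: insert a synthetic one so the API accepts the turn
      out.push({ kind: "tool_result", forId: m.id });
    }
  }
  return out;
}
```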

Tool Result Security

stripToolResultDetails() removes the .details field from tool results before:

This prevents untrusted/verbose tool output from leaking into contexts where it shouldn't be.

Session Manager Initialization (src/agents/pi-embedded-runner/session-manager-init.ts)

Edge case: if a session file exists but has no assistant message yet, it's reset to empty. This prevents a SessionManager quirk where flushed=true would cause the initial user prompt to be dropped.

Bootstrap Context Injection (src/agents/bootstrap-files.ts)

At the start of each run, workspace context files are injected into the system prompt:

resolveBootstrapContextForRun({
  workspaceDir,
  config,
  sessionKey,
  sessionId,
}) → { bootstrapFiles, contextFiles }

Size Limits

Limit                        Default
Per-file max chars           ~5,000
Total max chars (all files)  ~20,000

Files exceeding limits are truncated with a note in the system prompt.
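
A sketch of how those caps might be applied; the truncation marker text, helper name, and char-budget bookkeeping are assumptions of this sketch:

```typescript
const PER_FILE_MAX = 5_000;  // ~5k chars per file
const TOTAL_MAX = 20_000;    // ~20k chars across all files

// Truncate each file to the smaller of the per-file cap and the remaining
// total budget; files after the budget is exhausted are dropped.
function truncateBootstrapFiles(
  files: { name: string; text: string }[],
): { name: string; text: string; truncated: boolean }[] {
  let budget = TOTAL_MAX;
  const out: { name: string; text: string; truncated: boolean }[] = [];
  for (const f of files) {
    const cap = Math.min(PER_FILE_MAX, budget);
    if (cap <= 0) break;
    const truncated = f.text.length > cap;
    const text = truncated ? f.text.slice(0, cap) + "\n[truncated]" : f.text;
    budget -= Math.min(f.text.length, cap);
    out.push({ name: f.name, text, truncated });
  }
  return out;
}
```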

Bootstrap Hooks

Plugins can inject additional context via hooks, applied before session start.

Complete Context Lifecycle

1. SESSION CREATION
   └─ New SessionEntry in sessions.json
   └─ Empty .jsonl file created

2. RUN INITIALIZATION
   ├─ Acquire session write lock
   ├─ Repair file if corrupted
   ├─ Prewarm (4KB page cache read)
   ├─ Open SessionManager, load transcript
   ├─ Repair tool pairing
   └─ Apply history turn limits

3. PROMPT ASSEMBLY
   ├─ System prompt (modular sections, ~5-15k tokens)
   ├─ Bootstrap files (OPENCLAW.md, etc., ~5-20k tokens)
   ├─ Memory recall instructions
   └─ Conversation history (bounded by limits)

4. AGENT EXECUTION
   ├─ LLM processes within context window
   ├─ Tool calls add to history
   ├─ SessionManager flushes each turn to JSONL
   └─ Token usage tracked

5. CONTEXT OVERFLOW → COMPACTION
   ├─ Safeguard detects history > 50% of window
   ├─ Prune oldest message chunks
   ├─ Summarize via LLM (multi-stage if large)
   ├─ Replace old history with summary
   ├─ Fire compaction hooks
   └─ Continue with compressed context

6. SESSION CLOSE
   ├─ Release write lock
   ├─ Update session metadata (tokens, compaction count)
   └─ Session file persists for future runs

7. MAINTENANCE (periodic)
   ├─ Prune sessions > 30 days old
   ├─ Cap at 500 entries
   └─ Rotate oversized session store (>10MB)