CodeDocs Vault

Context Management

How conversation history is stored, bounded, compacted, and repaired.

Overview

Every agent conversation has a context window — the total amount of text (measured in tokens) that the LLM can see at once. Context management is the system that ensures conversations stay within this window, persists history to disk, handles corruption, and compacts old history when space runs out.

                   Context Window (e.g., 200k tokens)
┌──────────────────────────────────────────────────────────┐
│ System Prompt     │ Bootstrap  │ Conversation History    │
│ (~5-15k tokens)   │ Files      │ (user + assistant +     │
│                   │ (~5-20k)   │  tool calls/results)    │
│                   │            │                         │
│ [reserved floor]  │ [capped]   │ [managed by compaction] │
└──────────────────────────────────────────────────────────┘

Session Persistence

Storage Format

Sessions are stored as JSONL files (one JSON object per line) at:

~/.openclaw/agents/<agentId>/sessions/<sessionId>.jsonl

Each line is a message: { role, content, ... } — user messages, assistant responses, tool calls, and tool results.
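
Reading that format back is a few lines of TypeScript. A minimal sketch (the helper name is illustrative, not the store's actual API); unparseable lines are skipped in memory, mirroring the on-disk repair behavior covered under Transcript Repair:

```typescript
type SessionMessage = { role: string; content: unknown; [key: string]: unknown };

// Illustrative helper (not the real store API): parse the text of a
// session .jsonl file into messages, skipping lines that fail to parse.
function parseSessionLines(text: string): SessionMessage[] {
  const messages: SessionMessage[] = [];
  for (const line of text.split("\n")) {
    if (!line.trim()) continue; // ignore blank lines
    try {
      messages.push(JSON.parse(line) as SessionMessage);
    } catch {
      // malformed line: skipped here; persistent repair is a separate pass
    }
  }
  return messages;
}
```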

Session Store (src/config/sessions/store.ts)

A central index at ~/.openclaw/sessions.json tracks all sessions:

type SessionEntry = {
  sessionId: string;
  sessionFile: string;           // Path to .jsonl file
  updatedAt: number;             // Last modification (ms)
  contextTokens: number;         // Model's context window size
  totalTokens: number;           // Latest measured token count
  totalTokensFresh: boolean;     // True = from latest run
  compactionCount: number;       // Times compacted
  // ... delivery metadata (channel, peer, thread)
};

Caching: 45-second TTL with mtime-based invalidation.
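
As a sketch of that caching rule (names here are assumptions, not the real store.ts internals): a cached parse of sessions.json is reused only while it is younger than the TTL and the file's mtime has not changed since it was read.

```typescript
const SESSION_STORE_TTL_MS = 45_000; // 45-second TTL from this page

type CachedStore<T> = { value: T; readAt: number; mtimeMs: number };

// Reuse the cached parse only if it is within TTL and the file on disk
// has not been modified since the cached read.
function isCacheFresh<T>(
  cache: CachedStore<T> | undefined,
  now: number,
  currentMtimeMs: number,
): boolean {
  if (!cache) return false;
  if (now - cache.readAt > SESSION_STORE_TTL_MS) return false; // TTL expired
  return cache.mtimeMs === currentMtimeMs; // mtime-based invalidation
}
```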

Maintenance (configurable mode: warn or enforce): sessions older than 30 days are pruned, the index is capped at 500 entries, and the store file is rotated when it exceeds 10MB.

Session Keys (src/config/sessions/session-key.ts)

Session keys encode the full routing context:

agent:{agentId}:{provider}:{kind}:{userId}
Component   Purpose
agentId     Which agent owns this session
provider    Channel (discord, whatsapp, telegram, etc.)
kind        dm/direct, channel/group
userId      Peer identifier

Thread suffixes (:thread:123, :topic:456) are stripped for DM history limit lookups, ensuring consistent limits across threads.
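
A minimal sketch of the key scheme and suffix stripping (helper names are assumptions, not the session-key.ts API):

```typescript
// Build a key in the agent:{agentId}:{provider}:{kind}:{userId} shape.
function buildSessionKey(
  agentId: string,
  provider: string,
  kind: string,
  userId: string,
): string {
  return `agent:${agentId}:${provider}:${kind}:${userId}`;
}

// Strip a trailing :thread:<id> or :topic:<id> suffix so DM history
// limits resolve identically across threads of the same peer.
function stripThreadSuffix(sessionKey: string): string {
  return sessionKey.replace(/:(thread|topic):[^:]+$/, "");
}
```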

Session Locking (src/agents/session-write-lock.ts)

File-based exclusive locks prevent concurrent writes:

session-file.jsonl.lock → { pid: 12345, createdAt: 1708000000000 }
Feature               Detail
Acquisition timeout   10 seconds (default)
Max hold time         5 minutes (default)
Watchdog interval     60 seconds — releases locks held beyond max
Nested support        Counter-based — inner acquires increment, release on final
Stale detection       Checks if PID is alive, cleans orphaned locks
Signal handlers       SIGINT, SIGTERM, SIGQUIT, SIGABRT release locks
Windows atomicity     Temp-file + rename with 5-retry loop (50ms backoff)
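
The acquisition flow can be sketched roughly as follows, assuming a Node runtime; the helper names, the retry-on-steal recursion, and the exact timeouts are illustrative, not the session-write-lock.ts implementation:

```typescript
import * as fs from "node:fs";

type LockInfo = { pid: number; createdAt: number };

// Signal 0 sends nothing; it only checks whether the PID exists.
function isPidAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch {
    return false;
  }
}

function tryAcquireLock(lockPath: string, maxHoldMs = 5 * 60_000): boolean {
  try {
    // "wx" fails if the file already exists: exclusive create
    fs.writeFileSync(
      lockPath,
      JSON.stringify({ pid: process.pid, createdAt: Date.now() }),
      { flag: "wx" },
    );
    return true;
  } catch {
    // Lock exists: steal it only if the holder died or held it too long
    try {
      const info = JSON.parse(fs.readFileSync(lockPath, "utf8")) as LockInfo;
      const stale = !isPidAlive(info.pid) || Date.now() - info.createdAt > maxHoldMs;
      if (stale) {
        fs.unlinkSync(lockPath); // clean orphaned lock, then retry once
        return tryAcquireLock(lockPath, maxHoldMs);
      }
    } catch {
      // unreadable lock file: treat as contended
    }
    return false;
  }
}
```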

History Limiting (src/agents/pi-embedded-runner/history.ts)

Before the agent runs, conversation history is trimmed by user turn count:

limitHistoryTurns(messages, limit)
// Counts user turns backwards from the end
// Returns all messages from the last N user turns onwards
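
A minimal runnable sketch of that rule (types are simplified; the real helper in history.ts operates on full transcript messages):

```typescript
type Msg = { role: "user" | "assistant" | "toolResult"; content: string };

// Keep everything from the Nth-from-last user turn onwards.
function limitHistoryTurns(messages: Msg[], limit: number): Msg[] {
  if (limit <= 0) return messages;
  let userTurns = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === "user" && ++userTurns === limit) {
      return messages.slice(i);
    }
  }
  return messages; // fewer than `limit` user turns: keep all
}
```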

Limits are resolved per-session from config:

Config Key                                     Scope
channels.{provider}.dms.{userId}.historyLimit  Per-user override
channels.{provider}.dmHistoryLimit             DM default for a channel
channels.{provider}.historyLimit               Channel/group default

Source: getDmHistoryLimitFromSessionKey() in src/agents/pi-embedded-runner/history.ts

Context Window Guards (src/agents/context-window-guard.ts)

Resolution priority for context window size:

  1. modelsConfig in openclaw.json (per-provider override)
  2. model.contextWindow from model discovery
  3. agentContextTokens cap (if set)
  4. Default: 200,000 tokens
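
One plausible reading of that priority order, as a sketch; treating the agentContextTokens cap as an upper bound on the resolved value is this sketch's assumption:

```typescript
const DEFAULT_CONTEXT_WINDOW = 200_000;

function resolveContextWindow(opts: {
  modelsConfigOverride?: number;    // 1. modelsConfig in openclaw.json
  discoveredContextWindow?: number; // 2. model.contextWindow from discovery
  agentContextTokensCap?: number;   // 3. per-agent cap
}): number {
  const base =
    opts.modelsConfigOverride ??
    opts.discoveredContextWindow ??
    DEFAULT_CONTEXT_WINDOW;         // 4. fallback default
  const cap = opts.agentContextTokensCap;
  return cap !== undefined && cap < base ? cap : base;
}
```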

Safety bounds:

Token Budgeting

Reserve Floor (src/agents/pi-settings.ts)

DEFAULT_PI_COMPACTION_RESERVE_TOKENS_FLOOR = 20,000

This is the minimum number of tokens held back for the system prompt, tools, and bootstrap context. Compaction must always leave at least this much of the window free; history can never grow into the reserve.

Budget Allocation

Context Window (200k tokens)
├── Reserve Floor (20k) — system prompt, tools, bootstrap
├── History Budget (up to 50% of window = 100k)
│   └── Compaction triggers when exceeded
└── Working Space (remaining) — current turn + tool results
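
The arithmetic of that split, using the numbers above (the helper name is illustrative):

```typescript
const RESERVE_FLOOR = 20_000;   // system prompt, tools, bootstrap
const MAX_HISTORY_SHARE = 0.5;  // history gets up to half the window

// Derive the three budget regions from a total context window size.
function budgets(contextWindow: number) {
  const historyBudget = Math.floor(contextWindow * MAX_HISTORY_SHARE);
  const workingSpace = contextWindow - RESERVE_FLOOR - historyBudget;
  return { reserveFloor: RESERVE_FLOOR, historyBudget, workingSpace };
}
```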

Compaction

When conversation history approaches the context window limit, compaction summarizes older turns to free token budget.

How It Works (src/agents/compaction.ts, src/agents/pi-embedded-runner/compact.ts)

1. Acquire session write lock (5 min max hold)
2. Repair session file if needed (fix malformed JSONL)
3. Prewarm file (read 4KB for OS page cache)
4. Load session via SessionManager
5. Sanitize transcript (repair tool pairing)
6. Apply history turn limits
7. Run compaction:
   a. Estimate token usage per message
   b. Split messages into chunks (proportional to token weight)
   c. Summarize each chunk via LLM
   d. Merge partial summaries if multi-part
   e. Replace old history with summary
8. Fire before_compaction / after_compaction hooks
9. Log diagnostics (pre/post message counts, token deltas)
10. Release lock
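
Step 7b (splitting messages into chunks proportional to token weight) might look roughly like this greedy sketch; the real chunker in compact.ts may differ:

```typescript
// Greedy split: accumulate items until a chunk reaches its share of the
// estimated total, then start the next chunk. tokensOf stands in for the
// per-message token estimator.
function splitByTokenWeight<T>(
  items: T[],
  tokensOf: (item: T) => number,
  parts: number,
): T[][] {
  const total = items.reduce((sum, it) => sum + tokensOf(it), 0);
  const target = total / parts;
  const chunks: T[][] = [];
  let current: T[] = [];
  let acc = 0;
  for (const item of items) {
    current.push(item);
    acc += tokensOf(item);
    if (acc >= target && chunks.length < parts - 1) {
      chunks.push(current); // chunk reached its token share
      current = [];
      acc = 0;
    }
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```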

Safeguard Mode (src/agents/pi-extensions/compaction-safeguard.ts)

When agents.defaults.compaction.mode = "safeguard":

Max history tokens = contextWindow × maxHistoryShare × SAFETY_MARGIN
                   = 200,000     × 0.5              × 1.2
                   = 120,000 tokens

If history exceeds this budget, pruneHistoryForContextShare() iteratively removes the oldest message chunks until within budget, repairing tool pairing after each round.
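
A sketch of the safeguard budget and prune loop; the chunk shape and helper names are stand-ins for the real compaction-safeguard.ts code, which also repairs tool pairing after each pruning round:

```typescript
const SAFETY_MARGIN = 1.2; // 20% margin from this page

// Budget formula from above: window x share x margin.
function maxHistoryTokens(contextWindow: number, maxHistoryShare: number): number {
  return contextWindow * maxHistoryShare * SAFETY_MARGIN;
}

type Chunk = { tokens: number };

// Drop the oldest chunks until the estimated history total fits the budget.
function pruneToBudget(chunks: Chunk[], budget: number): Chunk[] {
  let kept = chunks.slice();
  let total = kept.reduce((sum, c) => sum + c.tokens, 0);
  while (kept.length > 1 && total > budget) {
    total -= kept[0].tokens;
    kept = kept.slice(1); // oldest first
  }
  return kept;
}
```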

Summarization Strategy

summarizeWithFallback()
├── Full summarization (all messages)
├── Partial (oversized messages compressed individually)
└── Generic fallback: "No prior history."

summarizeInStages() (for large contexts)
├── Split into N parts (min 4 messages per part)
├── Summarize each part independently
└── Merge partial summaries

Token estimation uses a 20% safety margin (SAFETY_MARGIN = 1.2) to account for inaccuracy.

Summary Enrichment

Compaction summaries include:

Timeout Protection

Parameter                  Value
Compaction timeout         5 minutes
Lock hold time             timeout + 2 min grace
LLM summarization retries  3 attempts, 500ms–5s backoff

Lane Queueing

Compaction uses command lanes to prevent conflicts:

Transcript Repair

Session files can become corrupted (process crashes mid-write, disk errors). Multiple repair mechanisms handle this.

File-Level Repair (src/agents/session-file-repair.ts)

1. Read entire file as text
2. Split by newlines, parse each as JSON
3. Drop unparseable lines
4. Validate first entry is session header
5. Write repaired JSONL (backup original as .bak-{pid}-{timestamp})
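
As a pure-function sketch of steps 1 through 5 (the header predicate and return shape are assumptions; the real code also writes the .bak backup before rewriting):

```typescript
// Repair raw .jsonl text: drop unparseable lines, require a valid session
// header as the first entry, and re-serialize the surviving entries.
function repairSessionText(
  raw: string,
  isSessionHeader: (entry: unknown) => boolean,
): { repaired: string; dropped: number } | null {
  const entries: unknown[] = [];
  let dropped = 0;
  for (const line of raw.split("\n")) {
    if (!line.trim()) continue;
    try {
      entries.push(JSON.parse(line));
    } catch {
      dropped++; // unparseable line: drop
    }
  }
  // No header means the file is unrepairable by this pass.
  if (entries.length === 0 || !isSessionHeader(entries[0])) return null;
  return {
    repaired: entries.map((e) => JSON.stringify(e)).join("\n") + "\n",
    dropped,
  };
}
```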

Tool Pairing Repair (src/agents/session-transcript-repair.ts)

The Anthropic API rejects transcripts where tool_use blocks don't have matching tool_result blocks. Repair handles:

Issue                                          Fix
Missing tool result                            Insert synthetic error: "[openclaw] missing tool result..."
Orphaned tool result (no matching call)        Drop it
Duplicate tool results (same ID)               Keep first, drop rest
Mis-ordered results (not adjacent to call)     Move adjacent to the call
Incomplete tool calls (missing id/name/input)  Drop
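
A partial sketch covering two of the rules above (synthetic-result insertion and orphan dropping), with simplified message shapes; duplicate and ordering handling are omitted:

```typescript
type TMsg =
  | { kind: "tool_use"; id: string }
  | { kind: "tool_result"; forId: string }
  | { kind: "text"; text: string };

function repairToolPairing(msgs: TMsg[]): TMsg[] {
  const callIds = new Set(
    msgs.filter((m) => m.kind === "tool_use").map((m) => (m as any).id as string),
  );
  const resultIds = new Set(
    msgs.filter((m) => m.kind === "tool_result").map((m) => (m as any).forId as string),
  );
  const out: TMsg[] = [];
  for (const m of msgs) {
    if (m.kind === "tool_result" && !callIds.has(m.forId)) continue; // orphan: drop
    out.push(m);
    if (m.kind === "tool_use" && !resultIds.has(m.id)) {
      // missing result: insert a synthetic one so the API accepts the turn
      out.push({ kind: "tool_result", forId: m.id });
    }
  }
  return out;
}
```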

Tool Result Security

stripToolResultDetails() removes the .details field from tool results before:

This prevents untrusted/verbose tool output from leaking into contexts where it shouldn't be.

Session Manager Initialization (src/agents/pi-embedded-runner/session-manager-init.ts)

Edge case: if a session file exists but has no assistant message yet, it's reset to empty. This prevents a SessionManager quirk where flushed=true would cause the initial user prompt to be dropped.

Bootstrap Context Injection (src/agents/bootstrap-files.ts)

At the start of each run, workspace context files are injected into the system prompt:

resolveBootstrapContextForRun({
  workspaceDir,
  config,
  sessionKey,
  sessionId,
}) → { bootstrapFiles, contextFiles }

Size Limits

Limit                        Default
Per-file max chars           ~5,000
Total max chars (all files)  ~20,000

Files exceeding limits are truncated with a note in the system prompt.
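
A sketch of how those caps might be applied; the truncation marker text, helper name, and char-budget bookkeeping are assumptions of this sketch:

```typescript
const PER_FILE_MAX = 5_000;  // ~5k chars per file
const TOTAL_MAX = 20_000;    // ~20k chars across all files

// Truncate each file to the smaller of the per-file cap and the remaining
// total budget; files after the budget is exhausted are dropped.
function truncateBootstrapFiles(
  files: { name: string; text: string }[],
): { name: string; text: string; truncated: boolean }[] {
  let budget = TOTAL_MAX;
  const out: { name: string; text: string; truncated: boolean }[] = [];
  for (const f of files) {
    const cap = Math.min(PER_FILE_MAX, budget);
    if (cap <= 0) break;
    const truncated = f.text.length > cap;
    const text = truncated ? f.text.slice(0, cap) + "\n[truncated]" : f.text;
    budget -= Math.min(f.text.length, cap);
    out.push({ name: f.name, text, truncated });
  }
  return out;
}
```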

Bootstrap Hooks

Plugins can inject additional context via hooks, applied before session start.

Complete Context Lifecycle

1. SESSION CREATION
   └─ New SessionEntry in sessions.json
   └─ Empty .jsonl file created

2. RUN INITIALIZATION
   ├─ Acquire session write lock
   ├─ Repair file if corrupted
   ├─ Prewarm (4KB page cache read)
   ├─ Open SessionManager, load transcript
   ├─ Repair tool pairing
   └─ Apply history turn limits

3. PROMPT ASSEMBLY
   ├─ System prompt (modular sections, ~5-15k tokens)
   ├─ Bootstrap files (OPENCLAW.md, etc., ~5-20k tokens)
   ├─ Memory recall instructions
   └─ Conversation history (bounded by limits)

4. AGENT EXECUTION
   ├─ LLM processes within context window
   ├─ Tool calls add to history
   ├─ SessionManager flushes each turn to JSONL
   └─ Token usage tracked

5. CONTEXT OVERFLOW → COMPACTION
   ├─ Safeguard detects history > 50% of window
   ├─ Prune oldest message chunks
   ├─ Summarize via LLM (multi-stage if large)
   ├─ Replace old history with summary
   ├─ Fire compaction hooks
   └─ Continue with compressed context

6. SESSION CLOSE
   ├─ Release write lock
   ├─ Update session metadata (tokens, compaction count)
   └─ Session file persists for future runs

7. MAINTENANCE (periodic)
   ├─ Prune sessions > 30 days old
   ├─ Cap at 500 entries
   └─ Rotate oversized session store (>10MB)