CodeDocs Vault

15. Memory System — Deep Dive

This doc goes beyond the taxonomy and prompt shape of the memory system (covered in docs 6 and 8) to the exact mechanics: extraction prompts, trigger thresholds, fork agent shape, recall scoring, team scope, cache integration.

15.1 The two extraction prompts — /workspaces/src/services/extractMemories/prompts.ts

buildExtractAutoOnlyPrompt() (prompts.ts:50-94)

Used when only private memory is active. Verbatim:

You are now acting as the memory extraction subagent. Analyze the most recent ~${newMessageCount}
messages above and use them to update your persistent memory systems.

Available tools: File Read, Grep, Glob, read-only Bash (ls/find/cat/stat/wc/head/tail and
similar), and File Edit/File Write for paths inside the memory directory only. Bash rm is not
permitted. All other tools — MCP, Agent, write-capable Bash, etc — will be denied.

You have a limited turn budget. File Edit requires a prior File Read of the same file, so the
efficient strategy is: turn 1 — issue all File Read calls in parallel for every file you might
update; turn 2 — issue all File Write/File Edit calls in parallel. Do not interleave reads and
writes across multiple turns.

You MUST only use content from the last ~${newMessageCount} messages to update your persistent
memories. Do not waste any turns attempting to investigate or verify that content further —
no grepping source files, no reading code to confirm a pattern exists, no git commands.

Followed by:

buildExtractCombinedPrompt() (prompts.ts:101-154)

Used when team memory is enabled. Similar skeleton but with:

15.2 When extraction fires — /workspaces/src/services/extractMemories/extractMemories.ts:296-616

Extraction is a stop hook — it runs after the model produces a final response with no tool calls.

The gates

  1. Feature flag: tengu_passport_quail must be true (extractMemories.ts:536-542). If false, extraction never runs.
  2. Auto-memory enabled: isAutoMemoryEnabled() checks CLAUDE_CODE_DISABLE_AUTO_MEMORY env var + user settings (line 545).
  3. Local mode only: not run in remote sessions (line 550).
  4. Throttle: feature flag tengu_bramble_lintel (default 1) determines every-N-turns cadence (lines 377-385).
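The four gates above can be read as one short-circuiting check. The following is a minimal sketch, not the code in extractMemories.ts: the `GateInputs` shape, the `turnsSinceLastExtraction` counter, and the function name are assumptions made for illustration. Only the flag names come from this doc.

```typescript
// Hypothetical input shape; only the two flag names are taken from the doc.
interface GateInputs {
  flags: { tengu_passport_quail: boolean; tengu_bramble_lintel?: number };
  autoMemoryEnabled: boolean; // stand-in for the result of isAutoMemoryEnabled()
  isRemoteSession: boolean;
  turnsSinceLastExtraction: number; // assumed counter of eligible turns
}

type GateResult = { run: true } | { run: false; reason: string };

function checkExtractionGates(g: GateInputs): GateResult {
  if (!g.flags.tengu_passport_quail) return { run: false, reason: "feature flag off" };
  if (!g.autoMemoryEnabled) return { run: false, reason: "auto-memory disabled" };
  if (g.isRemoteSession) return { run: false, reason: "remote session" };
  // The throttle flag defaults to 1, which means "every turn".
  const everyN = Math.max(1, g.flags.tengu_bramble_lintel ?? 1);
  if (g.turnsSinceLastExtraction < everyN) return { run: false, reason: "throttled" };
  return { run: true };
}
```

Ordering the cheap flag checks first means a disabled feature costs nothing beyond a boolean test.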

The mutual-exclusion check

hasMemoryWritesSince(messages, cursor) (lines 121-148) detects whether the main agent already wrote to memory files in the current turn. If it did, extraction skips — no point running a forked writer over the same content. The cursor still advances so the next extraction sees fresh delta.
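To make the check concrete, here is a sketch of how such a scan could work. The message and tool-use shapes, the tool names `FileEdit` and `FileWrite`, and the explicit `memoryDir` parameter are all assumptions. The real function takes only `(messages, cursor)` and its internals are not reproduced in this doc.

```typescript
// Hypothetical transcript shapes, for illustration only.
interface ToolUse { tool: string; filePath?: string }
interface Message { uuid: string; role: "user" | "assistant"; toolUses?: ToolUse[] }

const WRITE_TOOLS = new Set(["FileEdit", "FileWrite"]); // assumed tool names

function sketchHasMemoryWritesSince(
  messages: Message[],
  cursor: string | undefined,
  memoryDir: string,
): boolean {
  // Start just after the cursor. With no cursor, or a cursor that is not
  // found (findIndex returns -1), the scan covers every message.
  const start = cursor ? messages.findIndex(m => m.uuid === cursor) + 1 : 0;
  const prefix = memoryDir.endsWith("/") ? memoryDir : memoryDir + "/";
  return messages.slice(start).some(m =>
    m.role === "assistant" &&
    (m.toolUses ?? []).some(t =>
      WRITE_TOOLS.has(t.tool) && (t.filePath ?? "").startsWith(prefix)));
}
```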

Cursor-based dedup

lastMemoryMessageUuid  // tracked in AppState

countModelVisibleMessagesSince(cursor) counts only user/assistant messages after the cursor. After a successful extraction, the cursor advances to the last processed message (lines 354, 434). Tool-call counts aren't a direct trigger — token delta and turn cadence are the primary levers.
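A minimal sketch of cursor-based counting, assuming a flat message list with a `type` field. The `Msg` shape and the function name are hypothetical; the actual implementation may differ.

```typescript
// Hypothetical message shape: only user/assistant entries are model-visible here.
interface Msg { uuid: string; type: "user" | "assistant" | "system" | "progress" }

function countVisibleSince(messages: Msg[], cursor?: string): number {
  // An absent or unknown cursor yields index -1, so counting starts at message 0.
  const idx = cursor ? messages.findIndex(m => m.uuid === cursor) : -1;
  return messages
    .slice(idx + 1)
    .filter(m => m.type === "user" || m.type === "assistant")
    .length;
}
```

The returned count is what would be interpolated into the prompt as `newMessageCount`, so the subagent only looks at the unprocessed tail.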

Trailing runs

If a new extraction call arrives while one is running, the new context is stashed in pendingContext (lines 557-563). After the current run finishes, a trailing extraction fires for the delta — its newMessageCount is computed relative to the already-advanced cursor, not the original window.
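The stash-and-trail behavior can be sketched as a small synchronous state machine. This is an illustration under assumptions: the class name is invented, and "latest stashed context wins" is a guess at how multiple arrivals during one run are coalesced.

```typescript
class ExtractionCoalescer<C> {
  private running = false;
  private pendingContext: C | undefined;

  /** Returns the context to run now, or undefined if it was stashed behind a running extraction. */
  request(ctx: C): C | undefined {
    if (this.running) {
      this.pendingContext = ctx; // assumption: a newer context replaces an older stashed one
      return undefined;
    }
    this.running = true;
    return ctx;
  }

  /** Call when a run finishes; returns the stashed context for the trailing run, if any. */
  finish(): C | undefined {
    const next = this.pendingContext;
    this.pendingContext = undefined;
    this.running = next !== undefined; // stay "running" while the trailing run executes
    return next;
  }
}
```

Because the trailing run starts only after the cursor has advanced, it naturally sees just the messages that arrived during the previous run.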

15.3 The extraction fork — /workspaces/src/utils/forkedAgent.ts

Extraction runs via runForkedAgent(), the same primitive used by /btw and fork subagents. The fork inherits:

Restricted tool policy — createAutoMemCanUseTool() at extractMemories.ts:171-222

The prompt pre-announces this policy so the model doesn't try denied tools.

Budgets

Lifecycle

Fire-and-forget from the stop hook (line 598). The shutdown path awaits via drainPendingExtraction() (lines 611-615) so in-flight extractions complete before the process exits.

15.4 Memory loading (recall) — /workspaces/src/memdir/findRelevantMemories.ts

The scan

scanMemoryFiles() reads the memory directory, parses the first 30 lines of each file (enough for frontmatter), and returns file headers sorted newest-first by mtime, capped at 200 files. MEMORY.md itself is excluded because it is already loaded into the prompt separately.
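The ordering and capping rules can be shown on in-memory data. This sketch leaves out the filesystem entirely; `MemoryHeader`, `headerLines`, and `selectHeaders` are hypothetical names, and only the numbers 30 and 200 and the MEMORY.md exclusion come from the description above.

```typescript
// Hypothetical header record produced by a directory scan.
interface MemoryHeader { filename: string; mtimeMs: number }

const MAX_HEADER_LINES = 30;
const MAX_FILES = 200;

/** Keep only the leading lines of a file, which is enough to cover frontmatter. */
function headerLines(text: string, maxLines: number = MAX_HEADER_LINES): string {
  return text.split("\n").slice(0, maxLines).join("\n");
}

/** Drop the index file, sort newest-first by mtime, and cap the result. */
function selectHeaders(all: MemoryHeader[]): MemoryHeader[] {
  return all
    .filter(h => h.filename !== "MEMORY.md") // filter() copies, so sort() does not mutate the input
    .sort((a, b) => b.mtimeMs - a.mtimeMs)
    .slice(0, MAX_FILES);
}
```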

The selection call

A Sonnet sideQuery (not Haiku — precision matters here) is invoked with:

Query: ${currentUserMessage}

Available memories:
- [user] user_role.md (2026-03-14): deep Go expertise, new to React side of this repo
- [feedback] feedback_testing.md (2026-02-01): integration tests must hit a real database
- [project] project_merge_freeze.md (2026-03-05): merge freeze begins 2026-03-05 for mobile cut
- ...

Tools recently used: FileRead, Grep
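The input shown above is easy to assemble from scanned headers. A minimal sketch, assuming a hypothetical `Candidate` record and function name; the line format mirrors the example, but the real builder may differ in detail.

```typescript
// Hypothetical candidate record derived from a memory file's frontmatter and mtime.
interface Candidate { type: string; filename: string; date: string; description: string }

function buildSelectionInput(query: string, memories: Candidate[], recentTools: string[]): string {
  // One manifest line per memory: "- [type] filename (date): description"
  const lines = memories.map(m => `- [${m.type}] ${m.filename} (${m.date}): ${m.description}`);
  return [
    `Query: ${query}`,
    "",
    "Available memories:",
    ...lines,
    "",
    `Tools recently used: ${recentTools.join(", ")}`,
  ].join("\n");
}
```

Sending only one line per memory keeps the selection call small even when the directory holds many files.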

Filtering

Budgets

Composition

Selected memory files are read in full and injected into the system prompt alongside MEMORY.md. This is the "dynamic memory block" that sits post-SYSTEM_PROMPT_DYNAMIC_BOUNDARY.

15.5 MEMORY.md — format and discipline

Schema

Pure markdown, no frontmatter. One line per entry:

- [Title](filename.md) — one-line hook
- [Another](topic.md) — hook

The parser just iterates lines; it doesn't validate structure strictly. The prompt (quoted in doc 7) tells the model to keep lines under ~150 chars and move detail to topic files.
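A lenient line-by-line parser for this index format might look like the sketch below. The regex, the `IndexEntry` type, and the function name are assumptions, not the parser in memdir.ts. Lines that do not match are skipped, in keeping with the loose validation described above.

```typescript
interface IndexEntry { title: string; file: string; hook: string }

// "- [Title](file.md)", then an optional dash separator (U+2014 or "-"), then the hook text.
const ENTRY = /^- \[([^\]]+)\]\(([^)]+)\)\s*(?:\u2014|-)?\s*(.*)$/;

function parseIndex(markdown: string): IndexEntry[] {
  const entries: IndexEntry[] = [];
  for (const line of markdown.split("\n")) {
    const m = ENTRY.exec(line.trim());
    // Non-matching lines (headings, blanks, free text) are ignored, not rejected.
    if (m) entries.push({ title: m[1] ?? "", file: m[2] ?? "", hook: m[3] ?? "" });
  }
  return entries;
}
```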

Truncation — /workspaces/src/memdir/memdir.ts:57-103

The warning is visible to the model so it can clean up the index on the next extraction.

15.6 Individual memory file — schema

---
name: memory name
description: one-line hook used for relevance selection
type: user | feedback | project | reference
---
 
Body text. For `feedback` / `project`:
Lead with the rule.
 
**Why:** the reason given.
**How to apply:** when / where this kicks in.

Validation — /workspaces/src/memdir/memoryScan.ts:46-64

Body structure is not enforced — it's guidance in the prompt. The Why: / How to apply: format is consistently nudged but legacy files pre-dating the convention still work.
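For illustration, the frontmatter of the schema above can be read with a small hand-rolled parser. This is a sketch under assumptions: the function name and the decision to drop an unrecognized `type` are invented here, and the actual validation in memoryScan.ts is not reproduced in this doc.

```typescript
const MEMORY_TYPES = ["user", "feedback", "project", "reference"] as const;
type MemoryType = (typeof MEMORY_TYPES)[number];

interface MemoryFrontmatter { name?: string; description?: string; type?: MemoryType }

function parseFrontmatter(text: string): MemoryFrontmatter {
  const lines = text.split("\n");
  // No opening "---" means no frontmatter; the body is then treated as free text.
  if (lines.length === 0 || lines[0].trim() !== "---") return {};
  const out: MemoryFrontmatter = {};
  for (const line of lines.slice(1)) {
    if (line.trim() === "---") break; // closing delimiter ends the header
    const sep = line.indexOf(":");
    if (sep < 0) continue;
    const key = line.slice(0, sep).trim();
    const value = line.slice(sep + 1).trim();
    if (key === "name") out.name = value;
    else if (key === "description") out.description = value;
    else if (key === "type" && (MEMORY_TYPES as readonly string[]).indexOf(value) >= 0) {
      out.type = value as MemoryType; // assumption: unknown types are dropped, not errors
    }
  }
  return out;
}
```

The body after the closing delimiter is never inspected, which matches the point that the Why / How to apply layout is guidance, not a checked schema.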

15.7 Team memory — /workspaces/src/memdir/teamMemPaths.ts + /workspaces/src/services/teamMemorySync/

Location

~/.claude/projects/<slug>/memory/team/ — a sibling of the private memory directory (teamMemPaths.ts:85). Each has its own MEMORY.md.

Gate

isTeamMemoryEnabled() (teamMemPaths.ts:73-78) — requires both auto-memory enabled and the tengu_herring_clock feature flag.

Prompt mode

buildCombinedMemoryPrompt() (teamMemPrompts.ts:22-100):

Security — path validation

validateTeamMemKey() and validateTeamMemWritePath() (teamMemPaths.ts:265-292) check for:

These matter because the team directory is synced to a shared location; a path-traversal write would affect other users.
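The specific checks performed by validateTeamMemKey() and validateTeamMemWritePath() are not listed here, so the following is only a generic path-traversal guard for a relative key, written as a sketch. The function name and every rule in it are assumptions, and it does not address symlinks, which a real validator would likely need to consider.

```typescript
/** Generic guard (hypothetical): accept only plain relative keys like "topic.md" or "sub/topic.md". */
function isSafeRelativeKey(key: string): boolean {
  if (key.length === 0 || key.indexOf("\0") >= 0) return false; // empty or contains a NUL byte
  if (key.startsWith("/") || key.indexOf("\\") >= 0) return false; // absolute or Windows-style path
  // Reject empty, "." and ".." segments so the key cannot climb out of its directory.
  return key.split("/").every(seg => seg !== "" && seg !== "." && seg !== "..");
}
```

Rejecting suspicious keys outright, instead of normalizing them, keeps the rule easy to audit.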

Sync

Sync lives in /workspaces/src/services/teamMemorySync/ and is not live during a session: a watcher syncs team memory changes at session start and end. Private memory is always local.

15.8 Session memory — the parallel track — /workspaces/src/services/SessionMemory/

Distinct from auto-memory. Same filesystem area (~/.claude/session-memory/<session-id>/…) but different purpose.

Goal

A running summary of this session that can survive compaction — so when compaction fires, the model has a durable record of what was done.

Structure

Triggers — sessionMemory.ts:134-181

const shouldExtract =
  (hasMetTokenThreshold && hasMetToolCallThreshold) ||
  (hasMetTokenThreshold && !hasToolCallsInLastTurn)
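The condition above can be wrapped in a runnable function to show how its inputs might be derived. The `TriggerState` and `Thresholds` shapes and the function name are hypothetical. Only the boolean expression comes from the source.

```typescript
// Hypothetical counters tracked since the last session-memory extraction.
interface TriggerState {
  tokensSinceLastExtraction: number;
  toolCallsSinceLastExtraction: number;
  hasToolCallsInLastTurn: boolean;
}

// Hypothetical threshold configuration; the actual values are not given in this doc.
interface Thresholds { minTokens: number; minToolCalls: number }

function shouldExtractSessionMemory(s: TriggerState, t: Thresholds): boolean {
  const hasMetTokenThreshold = s.tokensSinceLastExtraction >= t.minTokens;
  const hasMetToolCallThreshold = s.toolCallsSinceLastExtraction >= t.minToolCalls;
  return (
    (hasMetTokenThreshold && hasMetToolCallThreshold) ||
    (hasMetTokenThreshold && !s.hasToolCallsInLastTurn)
  );
}
```

Note that the token threshold appears in both branches, so it is a hard requirement. The tool-call threshold can be bypassed only when the last turn made no tool calls, which is a natural pause point.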

Prompt template

sessionMemory/prompts.ts:43-247 — template-based, customizable via ~/.claude/session-memory/config/prompt.md. The model is told to preserve structure (the sections above) and respect the token budgets.

Relationship to auto-memory

Both are extraction-driven; both use runForkedAgent(). The difference is persistence scope and file discipline.

15.9 Cache implications

Memory sits after the SYSTEM_PROMPT_DYNAMIC_BOUNDARY marker (constants/prompts.ts:495-573). That placement is deliberate:

loadMemoryPrompt() is called from a systemPromptSection('memory', …) registration (constants/prompts.ts:495) that is memoized for the session and reset only by /clear or /compact. A mid-session memory write therefore does not re-render the system prompt: the next turn still uses the cached value. Extraction never invalidates the section cache, so new memory takes effect only after the next /clear or /compact, or when a new session boots.
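The delayed-effect behavior follows directly from memoization. Here is a minimal sketch with a hypothetical `SectionCache` class; it stands in for whatever systemPromptSection does internally, which is not shown in this doc.

```typescript
// Hypothetical per-session cache of rendered system prompt sections.
class SectionCache {
  private cache = new Map<string, string>();
  private loaders: Record<string, () => string>;

  constructor(loaders: Record<string, () => string>) {
    this.loaders = loaders;
  }

  render(name: string): string {
    const hit = this.cache.get(name);
    if (hit !== undefined) return hit; // cached value wins, even if the source has changed
    const loader = this.loaders[name];
    if (!loader) throw new Error(`unknown section: ${name}`);
    const value = loader();
    this.cache.set(name, value);
    return value;
  }

  /** In this sketch, the only way to pick up new content (what /clear or /compact would trigger). */
  reset(): void {
    this.cache.clear();
  }
}
```

A write to the underlying memory files changes what the loader would return, but `render` keeps serving the cached string until `reset` is called, so the prompt prefix stays byte-identical between turns.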

This is a subtle choice: immediate-effect memory writes would give faster behavior change but worse cache economy. The current design prefers cache economy because most memory writes are not urgent (they're summaries of just-completed work).

15.10 /clear and /compact effects

/clear

/compact

Neither command erases or mutates persistent memories. Memory is strictly additive (barring manual /memory management).

15.11 Telemetry and debugging

15.12 The mental model

  1. Memory is extracted, not maintained. You don't tell the agent "remember this." The stop-hook extraction reads what just happened and writes what's worth keeping.
  2. Memory is a claim, not a truth. Recall-side guidance tells the model to verify file paths / function names against current state before acting.
  3. Index vs. topic. MEMORY.md is a one-line-per-item catalog so the recall scorer can decide without loading every file; topic files carry the detail.
  4. Fork-based extraction means writing memory is cheap — it shares the parent's cache, runs in the background, and costs only the delta.
  5. Cache-safe placement of memory in the prompt means writes don't blow up prompt costs.
  6. Scope discipline (private vs. team, auto vs. session) keeps different kinds of knowledge in their right homes.

Key files:

Concern | File
Extraction prompts | services/extractMemories/prompts.ts:50-154
Extraction orchestration | services/extractMemories/extractMemories.ts:296-616
Fork agent | utils/forkedAgent.ts
Recall (relevance selection) | memdir/findRelevantMemories.ts:39-75
Directory and truncation | memdir/memdir.ts:57-103
Type taxonomy | memdir/memoryTypes.ts:14, 113-178
Team memory scope | memdir/teamMemPaths.ts:73-292
Session memory | services/SessionMemory/sessionMemory.ts:134-181 + prompts.ts:43-247
Cache boundary | constants/prompts.ts:114-115, 495, 573