CodeDocs Vault

07 - Key Files Reference

File Map: Responsibility Index

Tier 1: Essential (Must Read to Understand the System)

| File | Lines | Responsibility |
| --- | --- | --- |
| agent/core/agent_loop.py | 1198 | The heart. Think-act cycle, streaming LLM calls, tool execution, approval flow, error recovery, retry logic |
| agent/main.py | 1359 | CLI entry point. Interactive REPL, headless mode, event rendering, approval prompts, slash commands, signal handling |
| agent/core/tools.py | 376 | Tool system. ToolSpec/ToolRouter, built-in + MCP dispatch, schema generation, introspection-based handler calling |
| agent/core/session.py | 306 | Session state. Context manager, tool router, cancellation, event dual-logging, model switching, auto-save/upload |
| agent/context_manager/manager.py | 330 | Context management. System prompt loading, message history, compaction (LLM summarization), dangling tool call repair |
| agent/prompts/system_prompt_v3.yaml | 165 | Active system prompt. Agent persona, literature-first approach, anti-hallucination rules, autonomous mode directives |
| backend/routes/agent.py | 504 | API surface. All REST + SSE endpoints, model config, SSE streaming, event generator |
| backend/session_manager.py | 457 | Multi-session orchestration. Session lifecycle, EventBroadcaster fan-out, capacity management |
| frontend/src/hooks/useAgentChat.ts | 743 | Frontend orchestrator. Per-session chat state, side-channel events, research tracking, reconnection |
| frontend/src/lib/sse-chat-transport.ts | 397 | SSE transport. Custom ChatTransport bridging backend events to Vercel AI SDK |

Tier 2: Important (Understand Key Features)

| File | Lines | Responsibility |
| --- | --- | --- |
| agent/tools/research_tool.py | 460 | Research sub-agent. Independent LLM context, cheaper model, read-only tools, 60-iteration cap |
| agent/tools/sandbox_client.py | 1054 | Sandbox infrastructure. HF Space creation, embedded FastAPI server, embedded Dockerfile, HTTP client with retry |
| agent/tools/sandbox_tool.py | 292 | Sandbox tools. Exposes sandbox_create/bash/read/write/edit as agent tools, auto-creation on first use |
| agent/tools/jobs_tool.py | 1095 | Training jobs. Python/Docker mode, log streaming, job tracking, hardware selection, scheduled jobs |
| agent/tools/papers_tool.py | 1295 | Paper research. 11 operations, HF + Semantic Scholar + arXiv APIs, citation graphs, full paper reading |
| agent/tools/docs_tools.py | 979 | Documentation. Whoosh search over 37 doc endpoints, OpenAPI spec indexing, doc page fetching |
| agent/core/doom_loop.py | 135 | Loop detection. Identical consecutive + repeating sequence detection, corrective prompt injection |
| agent/core/llm_params.py | 77 | LLM routing. Direct API vs HF Router, token resolution, reasoning effort, billing headers |
| frontend/src/store/agentStore.ts | 469 | UI state. Per-session state map, active session mirroring, research agents, panels, plans |
| frontend/src/components/Chat/ToolCallGroup.tsx | 1117 | Tool approval UI. Batch/individual approval, research visualization, script preview, hardware pricing |

Tier 3: Supporting (Understand Specific Subsystems)

| File | Lines | Responsibility |
| --- | --- | --- |
| agent/tools/local_tools.py | 426 | Local-mode replacement for sandbox tools (subprocess, pathlib) |
| agent/tools/dataset_tools.py | 439 | Dataset inspection with parallel fetching, chat format analysis |
| agent/tools/hf_repo_git_tool.py | 664 | 14 git operations on HF repos (branches, tags, PRs) |
| agent/tools/hf_repo_files_tool.py | 323 | File CRUD on HF repos (list, read, upload, delete) |
| agent/tools/github_find_examples.py | 460 | Fuzzy example discovery in GitHub repos |
| agent/tools/github_read_file.py | 302 | GitHub file reading with notebook conversion |
| agent/tools/github_list_repos.py | 287 | GitHub org/user repo listing |
| agent/tools/edit_utils.py | 268 | 4-pass fuzzy matching for code edits |
| agent/tools/plan_tool.py | 131 | Todo list with status tracking |
| agent/tools/utilities.py | 142 | Job formatting helpers (tables, details) |
| agent/config.py | 99 | Pydantic config with env var substitution |
| agent/core/hf_router_catalog.py | 129 | HF model catalog cache, fuzzy suggestions |
| agent/core/session_uploader.py | 202 | Detached subprocess for HF dataset upload |
| agent/utils/terminal_display.py | 516 | CLI rendering: Rich markdown, typewriter, sub-agent live display |
| agent/utils/particle_logo.py | 228 | Particle animation (spring physics, braille rendering) |
| agent/utils/braille.py | 120 | Braille pixel canvas (2x4 resolution per cell) |
| agent/utils/crt_boot.py | 113 | CRT boot animation (glitch, noise, scanlines) |
| agent/utils/boot_timing.py | 16 | Shared math (settle_curve, color interpolation) |
| agent/utils/reliability_checks.py | 14 | Training script save pattern check |
| backend/main.py | 82 | FastAPI app setup, CORS, routers, static files |
| backend/models.py | 105 | Pydantic request/response models |
| backend/routes/auth.py | 188 | HuggingFace OAuth 2.0 flow |
| backend/dependencies.py | 143 | Auth middleware, token validation, dev mode |
| frontend/src/components/CodePanel/CodePanel.tsx | 585 | Script preview, editing, output display, plans |
| frontend/src/components/WelcomeScreen/WelcomeScreen.tsx | 458 | 3-step onboarding checklist |
| frontend/src/components/Layout/AppLayout.tsx | 436 | Main layout with sidebar + resizable panel |
| frontend/src/lib/convert-llm-messages.ts | 139 | Backend -> AI SDK message format conversion |
| frontend/src/store/sessionStore.ts | 96 | Session list persistence |
| frontend/src/store/layoutStore.ts | 41 | Layout preferences |

Configuration Files

| File | Purpose |
| --- | --- |
| configs/main_agent_config.json | Default agent config (model, MCP servers, session settings) |
| agent/prompts/system_prompt.yaml | V1 system prompt (historical) |
| agent/prompts/system_prompt_v2.yaml | V2 system prompt (historical) |
| agent/prompts/system_prompt_v3.yaml | V3 system prompt (ACTIVE) |
| pyproject.toml | Python project config, dependencies, entry point |
| frontend/package.json | Frontend dependencies |
| frontend/vite.config.ts | Vite build config, proxy, aliases |
| frontend/tsconfig.json | TypeScript config |
| Dockerfile | Multi-stage Docker build for HF Spaces |
| backend/start.sh | Production entrypoint script |
| .devcontainer/devcontainer.json | Dev container (Claude Code sandbox) |
| .gitignore | Git ignore patterns |

Important Prompts and Instructions

System Prompts

| File | Status | Key Innovation |
| --- | --- | --- |
| agent/prompts/system_prompt_v3.yaml | ACTIVE | Literature-first research mandate, explicit failure mode catalog, scope-change prohibition, autonomous mode directives |
| agent/prompts/system_prompt_v2.yaml | Historical | Three-phase workflow (Research -> Plan -> Implement), training job checklists, verification checklist |
| agent/prompts/system_prompt.yaml | Historical | Original persona, 8 worked examples |

Research Sub-Agent Prompt

Compaction Prompt

Doom Loop Corrective Prompts

Title Generation Prompt


Design Patterns Summary

| Pattern | Where | Purpose |
| --- | --- | --- |
| Queue-based producer-consumer | agent_loop.py, session_manager.py | Decouples UI from agent logic |
| Event broadcasting (fan-out) | session_manager.py:41 | One source queue -> N SSE subscribers |
| Strategy pattern | tools.py | Built-in handlers vs MCP dispatch via uniform interface |
| Introspection-based dispatch | tools.py:253 | inspect.signature() to detect session/tool_call_id params |
| Async context manager | tools.py:211 | ToolRouter lifecycle (MCP init/cleanup) |
| Fire-and-forget subprocess | session_uploader.py:248 | Detached upload process that survives agent exit |
| Read-before-write guard | sandbox_client.py:818, local_tools.py:386 | Prevents blind file overwrites |
| Fuzzy matching cascade | edit_utils.py:35 | 4-pass matching (exact -> trim -> normalize) |
| Spring-damping physics | particle_logo.py:36 | Particle animation convergence |
| Per-session state with active mirroring | agentStore.ts:265 | Session-specific state mirrored to flat fields for active session |
| Custom transport bridge | sse-chat-transport.ts | Backend SSE -> Vercel AI SDK UIMessageChunk |
| Side-channel callbacks | useAgentChat.ts:44 | Non-chat events (research, panels) processed alongside chat |
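
The introspection-based dispatch pattern listed above can be sketched roughly as follows. This is a minimal illustration, not the code in tools.py: the `call_handler` helper, the handler shape (an async callable taking an `arguments` dict), and the two sample handlers are assumptions made for the example. Only the use of inspect.signature() to detect `session` and `tool_call_id` parameters comes from the table.

```python
import asyncio
import inspect
from typing import Any, Awaitable, Callable, Dict, Optional

# Hypothetical handler shape: an async callable that takes the tool arguments
# and may optionally declare `session` and/or `tool_call_id` parameters.
Handler = Callable[..., Awaitable[str]]


async def call_handler(
    handler: Handler,
    arguments: Dict[str, Any],
    *,
    session: Any = None,
    tool_call_id: Optional[str] = None,
) -> str:
    """Call `handler`, passing context only if its signature asks for it."""
    params = inspect.signature(handler).parameters
    extra: Dict[str, Any] = {}
    if "session" in params:
        extra["session"] = session
    if "tool_call_id" in params:
        extra["tool_call_id"] = tool_call_id
    return await handler(arguments, **extra)


# Two illustrative handlers (hypothetical): one plain, one that wants the call ID.
async def echo(arguments: Dict[str, Any]) -> str:
    return f"echo:{arguments['text']}"


async def tracked(arguments: Dict[str, Any], tool_call_id: str) -> str:
    return f"{tool_call_id}:{arguments['text']}"
```

With this shape, simple handlers stay free of context parameters they do not use, while handlers that need the session or call ID just declare them.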

Notable Design Tradeoffs

In-Memory Sessions (No Database)

Sessions live only in server memory, so a server restart loses them. Session trajectories are uploaded to HF datasets in a fire-and-forget fashion, but the server cannot resume a session from them. As a partial mitigation, the frontend persists messages to localStorage.

Tradeoff: Simplicity and low latency vs. durability. Acceptable for a tool where sessions are typically short-lived.
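
A fire-and-forget upload of this kind is typically built on a detached child process. The sketch below shows the general technique with the standard library only. The `spawn_detached` and `upload_command` names and the placeholder command are hypothetical, and the actual invocation in session_uploader.py may differ.

```python
import subprocess
import sys
from typing import List


def spawn_detached(args: List[str]) -> int:
    """Start a child process and return its PID without waiting for it.

    On POSIX, start_new_session=True runs the child in its own session, so
    it is not tied to the parent's terminal or process group.
    """
    proc = subprocess.Popen(
        args,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,
    )
    return proc.pid


def upload_command(trajectory_path: str) -> List[str]:
    # Placeholder command (assumption): a real uploader would read the
    # trajectory file and push it to an HF dataset.
    return [sys.executable, "-c", "import sys; sys.exit(0)", trajectory_path]
```

The caller never learns whether the upload succeeded, which is the durability cost described above.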

Embedded Sandbox Server

The sandbox server code (FastAPI app + Dockerfile) is stored as a string literal inside sandbox_client.py and uploaded to the HF Space on creation. This means the sandbox server code can't be independently versioned or tested.

Tradeoff: Self-contained deployment (no external dependencies) vs. maintainability.
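
The embedding pattern can be pictured as below. The two string constants are short hypothetical stand-ins for the real embedded sources, and `files_to_upload` and `embedded_server_parses` are illustrative names, not functions from sandbox_client.py. The second helper shows one cheap check that remains possible even though the server is a string: confirming that it at least parses as Python.

```python
import ast
from typing import Dict

# Hypothetical stand-ins for the embedded sources; the real strings are longer.
EMBEDDED_SERVER = '''
from fastapi import FastAPI

app = FastAPI()


@app.get("/health")
def health():
    return {"ok": True}
'''

EMBEDDED_DOCKERFILE = '''
FROM python:3.11-slim
RUN pip install fastapi uvicorn
WORKDIR /app
COPY app.py /app/app.py
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
'''


def files_to_upload() -> Dict[str, str]:
    """Map target filenames in the Space to their embedded contents."""
    return {
        "app.py": EMBEDDED_SERVER.lstrip(),
        "Dockerfile": EMBEDDED_DOCKERFILE.lstrip(),
    }


def embedded_server_parses() -> bool:
    """Syntax-check the embedded server without importing or running it."""
    try:
        ast.parse(EMBEDDED_SERVER)
    except SyntaxError:
        return False
    return True
```

A parse check catches syntax errors only; it does not exercise the server's behavior.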

Downgraded Research Model

The research sub-agent uses a cheaper model (claude-sonnet-4-6 instead of claude-opus-4-6). Because a single research session can make many LLM calls (the sub-agent is capped at 60 iterations), this reduces cost.

Tradeoff: Cost savings vs. research quality. Mitigated by the research tool having its own focused system prompt and access to the same information tools.

No Parallel Tool Execution for Approval Tools

Tools that require approval are batched and presented to the user together, but once approved they execute sequentially inside the approval handler. Tools that do not require approval execute in parallel.

Tradeoff: Simpler approval UX (user sees all pending tools at once) vs. potential latency when multiple approved tools could run concurrently.
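
The split between parallel and sequential execution might look like the following asyncio sketch. The `ToolCall` dataclass, `run_tool`, and `execute` are hypothetical names, and the real approval handler in agent_loop.py is likely structured differently. The sketch only illustrates the behavior described above.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Set


@dataclass
class ToolCall:
    name: str
    needs_approval: bool


async def run_tool(call: ToolCall) -> str:
    await asyncio.sleep(0)  # stand-in for real tool work
    return f"{call.name}:done"


async def execute(calls: List[ToolCall], approved: Set[str]) -> List[str]:
    auto = [c for c in calls if not c.needs_approval]
    gated = [c for c in calls if c.needs_approval and c.name in approved]

    # Tools that need no approval run concurrently; gather keeps input order.
    results = list(await asyncio.gather(*(run_tool(c) for c in auto)))

    # Approved tools run one at a time, in the order they were requested.
    for call in gated:
        results.append(await run_tool(call))
    return results
```

Replacing the final loop with a second asyncio.gather would remove the latency cost, at the price of interleaved output from tools the user has just approved.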

Doom Loop Detection Heuristics

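Detection hashes each call's argument string with MD5 and applies fixed thresholds: 3 identical consecutive calls, or a repeating sequence of 2-5 calls. It can miss subtle loops (calls whose arguments differ even trivially produce different hashes) and may trigger on legitimate repeated calls.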

Tradeoff: Simple implementation that catches common cases vs. sophisticated analysis that might be more accurate but harder to maintain.
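
A minimal sketch of this kind of heuristic is shown below. It is not the code in doom_loop.py. The function names, the JSON serialization of arguments, and the `repeats` parameter (how many times a pattern must recur) are assumptions. Only the MD5 hashing, the threshold of 3, and the 2-5 sequence length come from the description above.

```python
import hashlib
import json
from typing import Any, Dict, List, Optional


def signature(tool_name: str, arguments: Dict[str, Any]) -> str:
    """Fingerprint a call as the tool name plus an MD5 digest of its arguments."""
    payload = json.dumps(arguments, sort_keys=True)
    return f"{tool_name}:{hashlib.md5(payload.encode()).hexdigest()}"


def identical_run(history: List[str], threshold: int = 3) -> bool:
    """True if the last `threshold` calls all share one signature."""
    if len(history) < threshold:
        return False
    return len(set(history[-threshold:])) == 1


def repeating_sequence(
    history: List[str], min_len: int = 2, max_len: int = 5, repeats: int = 2
) -> Optional[int]:
    """Return the pattern length if the tail is one pattern repeated `repeats` times."""
    for n in range(min_len, max_len + 1):
        span = n * repeats
        if len(history) < span:
            break  # longer patterns need even more history
        tail = history[-span:]
        pattern = tail[:n]
        if all(tail[i] == pattern[i % n] for i in range(span)):
            return n
    return None
```

Both weaknesses named above are visible here: changing any argument changes the digest, so near-identical calls slip through, while a legitimate polling loop that repeats the same call trips `identical_run`.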