CodeDocs Vault

02 - Architecture

High-Level Architecture

                         +-----------------------+
                         |     User Interfaces   |
                         +-----------+-----------+
                                     |
                    +----------------+----------------+
                    |                                 |
            +-------v-------+               +---------v--------+
            |   CLI (Rich)  |               |  Web UI (React)  |
            |  agent/main.py|               |  frontend/src/   |
            +-------+-------+               +---------+--------+
                    |                                 |
                    |  submission_queue               | POST /api/chat/{id}
                    |  event_queue                    | SSE events
                    |                                 |
            +-------v---------------------------------v---------+
            |                    Agent Core                     |
            |                                                   |
            |  +-------------+  +---------------+  +----------+ |
            |  | agent_loop  |  | ContextManager|  | Session  | |
            |  | (think/act) |  | (history,     |  | (state,  | |
            |  |             |  | compaction,   |  |  events, | |
            |  |             |  | system prompt)|  |  cancel) | |
            |  +------+------+  +---------------+  +----------+ |
            |         |                                         |
            |  +------v------+  +---------------+               |
            |  | ToolRouter  |  | DoomLoop      |               |
            |  | (dispatch)  |  | (detection)   |               |
            |  +------+------+  +---------------+               |
            +---------+-----------------------------------------+
                      |
          +-----------+-------------+
          |                         |
    +-----v------+           +------v------+
    | Built-in   |           | MCP Tools   |
    | Tools (17) |           | (HF MCP)    |
    +-----+------+           +------+------+
          |                         |
   +-----v-------------------------v-----+
   |         External Services           |
   |                                     |
   |  HuggingFace Hub    GitHub API      |
   |  HF Inference       Semantic Scholar|
    |  HF Datasets Server arXiv/ar5iv     |
    |  HF Spaces (sandbox)                |
   +-------------------------------------+

Queue-Based Async Architecture

The core architectural insight is the decoupling of UI from agent logic via two async queues. This design is explicitly modeled on OpenAI's Codex architecture (referenced in code comments).

  User Input                                               Display
      |                                                       ^
      v                                                       |
+-------------+    Operations    +----------------+   Events  +-----------+
| submission  | ───────────────> | submission_loop| ────────> |  event    |
| _queue      |                  | (agent_loop.py)|           |  _queue   |
+-------------+                  +----------------+           +-----------+

Operations: USER_INPUT, EXEC_APPROVAL, INTERRUPT, UNDO, COMPACT, SHUTDOWN
Events:     ready, assistant_chunk, tool_call, tool_output, approval_required,
            turn_complete, error, interrupted, compacted, tool_log, plan_update,
            tool_state_change, processing, shutdown

Why queues? This design enables:

  1. Both CLI and Web frontends to share the same core loop
  2. Non-blocking UI -- the display never waits for the agent
  3. Clean cancellation -- INTERRUPT is just another queue message
  4. Multiple SSE subscribers for the same session (web broadcast pattern)
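The two-queue pattern above can be sketched in a few lines of asyncio. This is a minimal illustration, not the project's actual classes: the `Submission` dataclass, operation strings, and event payloads are assumptions modeled on the operations and events listed above.

```python
import asyncio
from dataclasses import dataclass, field

@dataclass
class Submission:
    op: str                        # e.g. "USER_INPUT", "INTERRUPT", "SHUTDOWN"
    payload: dict = field(default_factory=dict)

async def submission_loop(submissions: asyncio.Queue, events: asyncio.Queue):
    """Dequeue operations, run the agent, publish events.
    The UI never calls agent code directly -- it only reads the event queue."""
    while True:
        sub = await submissions.get()
        if sub.op == "SHUTDOWN":
            await events.put({"event_type": "shutdown"})
            return
        if sub.op == "USER_INPUT":
            # A real implementation would run the full think-act loop here.
            await events.put({"event_type": "assistant_chunk",
                              "text": f"echo: {sub.payload['text']}"})
            await events.put({"event_type": "turn_complete"})

async def demo():
    subs, evs = asyncio.Queue(), asyncio.Queue()
    task = asyncio.create_task(submission_loop(subs, evs))
    await subs.put(Submission("USER_INPUT", {"text": "hi"}))
    await subs.put(Submission("SHUTDOWN"))
    await task
    out = []
    while not evs.empty():
        out.append((await evs.get())["event_type"])
    return out

event_types = asyncio.run(demo())
```

Because INTERRUPT and SHUTDOWN travel through the same queue as user input, cancellation needs no special signaling machinery.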

Component Responsibilities

Agent Core (agent/core/)

Agent Loop (agent_loop.py)
    The think-act cycle: call LLM, parse tool calls, check approvals,
    execute tools, repeat

Session (session.py)
    Per-session state: context manager, tool router, cancellation,
    event logging, trajectory persistence

ToolRouter (tools.py)
    Tool registration, dispatch (built-in + MCP), schema generation
    for the LLM

ContextManager (context_manager/manager.py)
    Message history, system prompt loading, context window tracking,
    compaction (summarization), dangling tool call repair

DoomLoop (doom_loop.py)
    Detects repetitive tool call patterns, injects corrective prompts

LLM Params (llm_params.py)
    Builds litellm kwargs per model type (Anthropic direct, OpenAI
    direct, HF Router)

Session Uploader (session_uploader.py)
    Fire-and-forget trajectory upload to the HF dataset repo

HF Router Catalog (hf_router_catalog.py)
    Model catalog fetch/cache for validation and context window
    discovery

Config (config.py)
    Pydantic config with env var substitution
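One way to implement the doom-loop check above is to flag when the last N tool calls are identical. This is a hedged sketch: the window size, the identity criterion (name + serialized args), and the function name are assumptions, not the actual doom_loop.py logic.

```python
def is_doom_loop(tool_calls, window=3):
    """tool_calls: list of (name, args_json) tuples, oldest first.
    Returns True when the most recent `window` calls are all identical,
    i.e. the agent is repeating itself without making progress."""
    if len(tool_calls) < window:
        return False
    recent = tool_calls[-window:]
    return all(call == recent[0] for call in recent)

# Three identical calls in a row trip the detector...
repeats = [("read_file", '{"path": "a.py"}')] * 3
looping = is_doom_loop(repeats)

# ...while varied calls do not.
varied = [("read_file", '{"path": "a.py"}'),
          ("grep", '{"q": "x"}'),
          ("read_file", '{"path": "a.py"}')]
ok = is_doom_loop(varied)
```

On detection, the real component injects a corrective prompt into the context rather than aborting the turn.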

Backend (backend/)

App (main.py)
    FastAPI app, CORS, routers, static file serving

Agent Routes (routes/agent.py)
    REST + SSE endpoints for sessions, chat, approval, model config

Auth Routes (routes/auth.py)
    HuggingFace OAuth 2.0 flow

Session Manager (session_manager.py)
    Multi-session orchestration, EventBroadcaster fan-out, capacity
    management

Dependencies (dependencies.py)
    Auth middleware, token validation, dev mode bypass

Models (models.py)
    Pydantic request/response schemas

Frontend (frontend/src/)

SSE Transport (lib/sse-chat-transport.ts)
    Custom ChatTransport bridging backend SSE to the Vercel AI SDK

Agent Chat Hook (hooks/useAgentChat.ts)
    Per-session chat state, side-channel event processing, reconnection

Session Store (store/sessionStore.ts)
    Session list CRUD, active session tracking (persisted)

Agent Store (store/agentStore.ts)
    Per-session processing state, research agents, panels, plans

Layout Store (store/layoutStore.ts)
    Sidebar, panel, theme state

Message Persistence (lib/chat-message-store.ts)
    localStorage for UIMessages (max 50 sessions)

Research Persistence (lib/research-store.ts)
    localStorage for research sub-agent state

Message Converter (lib/convert-llm-messages.ts)
    Backend litellm format -> Vercel AI SDK UIMessage format

Data Flow: User Message to Agent Response

sequenceDiagram
    participant U as User
    participant UI as Frontend/CLI
    participant Q as Submission Queue
    participant AL as Agent Loop
    participant CM as Context Manager
    participant LLM as LLM Provider
    participant TR as Tool Router
    participant T as Tools
    participant EQ as Event Queue
 
    U->>UI: "Fine-tune Llama on this dataset"
    UI->>Q: Submission(USER_INPUT, text)
    Q->>AL: dequeue
    AL->>CM: add_message(user, text)
 
    loop Agent Loop (max 300 iterations)
        AL->>CM: get_messages()
        CM-->>AL: [system, ...history]
        AL->>AL: check doom_loop(messages)
        AL->>LLM: acompletion(messages, tools)
        LLM-->>AL: response (streamed chunks)
        AL->>EQ: assistant_chunk events
        AL->>CM: add_message(assistant, content + tool_calls)
 
        alt Has tool calls
            AL->>AL: _needs_approval(tool_calls)?
 
            alt Needs approval
                AL->>EQ: approval_required event
                Note over AL: Pauses, waits for EXEC_APPROVAL
            end
 
            AL->>TR: call_tool(name, args) [parallel]
            TR->>T: dispatch to handler
            T-->>TR: (output, success)
            TR-->>AL: results
            AL->>EQ: tool_output events
            AL->>CM: add_message(tool, result) for each
        else No tool calls (text only)
            AL->>EQ: turn_complete
            Note over AL: Done, wait for next submission
        end
    end
 
    EQ->>UI: stream events
    UI->>U: render response
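The think-act cycle in the diagram condenses to a short loop. The sketch below is illustrative: `fake_llm` and `fake_tool` stand in for litellm's streamed completion and the ToolRouter dispatch, and the message shapes are simplified assumptions.

```python
import asyncio

async def agent_loop(messages, llm, call_tool, max_iterations=300):
    """Think (call LLM), then act (run tools in parallel), until the
    LLM replies with text only -- or the iteration cap is hit."""
    for _ in range(max_iterations):
        reply = await llm(messages)                        # think
        messages.append({"role": "assistant", **reply})
        tool_calls = reply.get("tool_calls", [])
        if not tool_calls:                                 # text only: turn done
            return reply["content"]
        results = await asyncio.gather(                    # act (parallel tools)
            *(call_tool(tc["name"], tc["args"]) for tc in tool_calls))
        for tc, result in zip(tool_calls, results):
            messages.append({"role": "tool", "name": tc["name"],
                             "content": result})
    raise RuntimeError("max iterations exceeded")

# Tiny stand-ins to exercise the loop once:
async def fake_llm(messages):
    if any(m["role"] == "tool" for m in messages):
        return {"content": "done", "tool_calls": []}
    return {"content": "", "tool_calls": [{"name": "echo", "args": {"x": 1}}]}

async def fake_tool(name, args):
    return f"{name}({args})"

answer = asyncio.run(agent_loop(
    [{"role": "user", "content": "hi"}], fake_llm, fake_tool))
```

The real loop additionally streams assistant chunks, checks the doom-loop detector, and pauses for approval before executing gated tools.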

Data Flow: SSE Streaming (Web UI)

Frontend                          Backend                        Agent Core
   |                                |                               |
   | POST /api/chat/{id}           |                               |
   |  { text: "..." }              |                               |
   |------------------------------->|                               |
   |                                | subscribe(EventBroadcaster)   |
   |                                | submit(USER_INPUT)            |
   |                                |------------------------------>|
   |                                |                               |
   |  SSE: data: {"event_type":    |  Event Queue                  |
   |    "assistant_chunk",...}     |<------------------------------|
   |<-------------------------------|                               |
   |                                |                               |
   |  (pipeline:                   |                               |
   |   response.body               |                               |
   |   -> TextDecoderStream        |                               |
   |   -> SSEParserStream          |                               |
   |   -> EventToChunkStream       |                               |
   |   -> Vercel AI SDK)           |                               |
   |                                |                               |
   |  SSE: data: {"event_type":    |                               |
   |    "turn_complete",...}       |<------------------------------|
   |<-------------------------------|                               |
   |                                |                               |
   | Stream closes                 | unsubscribe                   |
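The server side of this stream frames each queued event as one SSE `data:` line followed by a blank line. A minimal sketch, where the helper names are illustrative but the wire format matches the payloads shown above:

```python
import json

def to_sse(event: dict) -> str:
    """Frame a single event dict as an SSE message (data line + blank line)."""
    return f"data: {json.dumps(event)}\n\n"

def parse_sse(raw: str):
    """Inverse (client side): pull event dicts back out of raw SSE text."""
    return [json.loads(line[len("data: "):])
            for line in raw.splitlines() if line.startswith("data: ")]

frame = to_sse({"event_type": "assistant_chunk", "delta": "Hello"})
events = parse_sse(to_sse({"event_type": "turn_complete"}))
```

In the actual frontend, this parsing happens inside the `SSEParserStream` stage of the pipeline shown in the diagram.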

Session Lifecycle

                    POST /api/session
                           |
                           v
                    +------+------+
                    | SessionMgr  |
                    | create()    |
                    | capacity ck |
                    +------+------+
                           |
               +-----------+-----------+
               |                       |
         +-----v------+         +------v------+
         | Session    |         | ToolRouter  |
         | (UUID,     |         | (built-in + |
         | context,   |         |  MCP tools) |
         | config)    |         +------+------+
         +-----+------+                |
               |                       |
               +-------+---------------+
                       |
                 +-----v------+
                 | _run_      |
                 | session()  |   <-- asyncio task
                 | loop       |
                 +-----+------+
                       |
         +-------------+-------------+
         |                           |
  +------v------+             +------v------+
  | submission  |             | Event       |
  | _queue      |             | Broadcaster |
  | (reads ops) |             | (fans out)  |
  +-------------+             +------+------+
                                     |
                               +-----+------+
                               | SSE subs   |
                               | (per-tab)  |
                               +------------+

  Termination triggers:
    - SHUTDOWN operation
    - DELETE /api/session/{id}
    - Unhandled exception (emergency save)
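The EventBroadcaster fan-out in the diagram reduces to one agent-side publish call and one queue per subscriber (one per browser tab). The class and method names below are assumptions about the pattern, not the backend's actual API:

```python
import asyncio

class EventBroadcaster:
    """Fan out each agent event to every subscribed consumer queue."""

    def __init__(self):
        self._subscribers: list = []

    def subscribe(self) -> asyncio.Queue:
        q = asyncio.Queue()
        self._subscribers.append(q)
        return q

    def unsubscribe(self, q: asyncio.Queue) -> None:
        self._subscribers.remove(q)

    async def publish(self, event: dict) -> None:
        for q in self._subscribers:        # every tab gets every event
            await q.put(event)

async def demo():
    bus = EventBroadcaster()
    tab1, tab2 = bus.subscribe(), bus.subscribe()
    await bus.publish({"event_type": "ready"})
    return await tab1.get(), await tab2.get()

a, b = asyncio.run(demo())
```

Per-subscriber queues mean a slow or disconnected tab never blocks the agent loop; its queue simply fills until it unsubscribes.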

Context Window Management

+----------------------------------------------------------+
|                      Context Window                      |
|                                                          |
|  +--------+  +----------+  +------ ... ------+  +------+ |
|  | System |  | 1st User |  |   Middle msgs   |  |Recent| |
|  | Prompt |  | Message  |  |  (compactable)  |  |  5   | |
|  +--------+  +----------+  +-----------------+  +------+ |
|                                                          |
|  Max: model_limit - 10,000 tokens (safety margin)        |
|  Compact at: max_context exceeded                        |
|  Summary budget: 10% of max_context                      |
+----------------------------------------------------------+

Compaction strategy:
  1. System prompt: ALWAYS preserved
  2. First user message: ALWAYS preserved (original task)
  3. Middle messages: Summarized by the LLM itself
  4. Last 5 messages: ALWAYS preserved (recent context)
  5. Result: [system] + [first_user] + [summary] + [recent_5]
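The five rules above reduce to simple list slicing. In this sketch, `summarize()` is a stub standing in for the LLM summarization call; the message shapes and helper names are assumptions:

```python
def compact(messages, summarize, keep_recent=5):
    """Return [system] + [first_user] + [summary] + [recent], per the
    documented strategy. Messages are dicts with role/content keys."""
    system, first_user = messages[0], messages[1]
    recent = messages[-keep_recent:]
    middle = messages[2:-keep_recent]
    if not middle:
        return messages                      # nothing to compact yet
    summary = {"role": "user",
               "content": f"[Conversation summary]\n{summarize(middle)}"}
    return [system, first_user, summary] + recent

history = ([{"role": "system", "content": "prompt"},
            {"role": "user", "content": "task"}] +
           [{"role": "assistant", "content": f"step {i}"} for i in range(20)])
compacted = compact(history, lambda msgs: f"{len(msgs)} messages elided")
```

Here a 22-message history collapses to 8 messages: system, first user message, one summary, and the last 5.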

Tool Approval Flow

Agent Loop: tool_call detected
      |
      v
  _needs_approval(tool)?
      |
  +---+---+
  |       |
  No      Yes
  |       |
  v       v
Execute   Emit approval_required event
  |              |
  |       +------v---------+
  |       |   CLI: prompt  |
  |       |   Web: inline  |
  |       |   approval UI  |
  |       +------+---------+
  |              |
  |       +------v---------+
  |       | User decision: |
  |       | approve/reject |
  |       | /yolo/feedback |
  |       +------+---------+
  |              |
  |       Submit EXEC_APPROVAL
  |              |
  v              v
Results -> add to context -> continue loop
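The gate above can be sketched as a check against a safe-tool allow-list plus a blocking wait on the submission queue. The allow-list contents, decision strings, and function names are illustrative assumptions, not the actual `_needs_approval` logic:

```python
import asyncio

# Hypothetical allow-list: tools that run without asking.
SAFE_TOOLS = {"read_file", "list_files", "web_search"}

def needs_approval(tool_name: str, yolo_mode: bool = False) -> bool:
    """Yolo mode approves everything; otherwise gate unlisted tools."""
    return not yolo_mode and tool_name not in SAFE_TOOLS

async def run_with_approval(tool_name, execute, decisions: asyncio.Queue):
    """Pause gated tools until an EXEC_APPROVAL decision arrives."""
    if needs_approval(tool_name):
        # A real implementation emits an approval_required event here,
        # then blocks until the UI submits EXEC_APPROVAL.
        decision = await decisions.get()
        if decision == "reject":
            return "tool call rejected by user"
    return await execute()

async def demo():
    decisions = asyncio.Queue()
    await decisions.put("approve")           # simulate the user's decision
    return await run_with_approval(
        "run_shell", lambda: asyncio.sleep(0, result="ok"), decisions)

outcome = asyncio.run(demo())
```

Because the decision arrives as just another queue message, the CLI prompt and the web's inline approval UI share this path unchanged.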

Multi-Session Web Architecture

Frontend renders ALL session components simultaneously:

+------------------------------------------------------+
| AppLayout                                            |
|  +----------+ +-----------------------------------+  |
|  | Session  | | SessionChat (session_1) [ACTIVE]  |  |
|  | Sidebar  | |   useAgentChat(session_1) running |  |
|  |          | |   renders: MessageList + ChatInput|  |
|  | - sess 1 | +-----------------------------------+  |
|  | - sess 2 | | SessionChat (session_2) [HIDDEN]  |  |
|  | - sess 3 | |   useAgentChat(session_2) running |  |
|  |          | |   renders: null                   |  |
|  |          | +-----------------------------------+  |
|  |          | | SessionChat (session_3) [HIDDEN]  |  |
|  |          | |   useAgentChat(session_3) running |  |
|  |          | |   renders: null                   |  |
|  +----------+ +-----------------------------------+  |
+------------------------------------------------------+

Each session's useAgentChat hook runs continuously,
processing events even when not visible. Only the
active session renders UI components.