CodeDocs Vault

06 - Backend API & Frontend UI

Backend (FastAPI)

Application Setup (backend/main.py)

Complete API Surface (backend/routes/agent.py)

Method  Endpoint                      Auth  Purpose
GET     /api/health                   No    Health check (active sessions, max sessions)
GET     /api/health/llm               No    LLM provider reachability (1-token probe)
GET     /api/config/model             No    Current model + available models list
POST    /api/config/model             Yes   Set global default LLM model
POST    /api/title                    Yes   Generate 6-word session title from first message
POST    /api/session                  Yes   Create new agent session
GET     /api/session/{id}             Yes   Get session info (processing state, message count)
POST    /api/session/{id}/model       Yes   Set per-session model (tab-scoped)
GET     /api/sessions                 Yes   List user's sessions
DELETE  /api/session/{id}             Yes   Delete session
POST    /api/chat/{id}                Yes   Primary SSE endpoint: submit + stream
GET     /api/events/{id}              Yes   Subscribe to events (reconnection)
POST    /api/interrupt/{id}           Yes   Interrupt agent loop
GET     /api/session/{id}/messages    Yes   Full message history
POST    /api/undo/{id}                Yes   Undo last turn
POST    /api/truncate/{id}            Yes   Truncate to before Nth user message
POST    /api/compact/{id}             Yes   Trigger context compaction
POST    /api/shutdown/{id}            Yes   Shutdown session

Available models (hardcoded in agent.py:39-62):

  1. Claude Opus 4.6 (Anthropic) -- recommended
  2. MiniMax M2.7 (HuggingFace) -- recommended
  3. Kimi K2.6 (HuggingFace)
  4. GLM 5.1 (HuggingFace)

Primary SSE Endpoint: POST /api/chat/{id} (line 319)

This is the main interaction endpoint. It:

  1. Subscribes to the session's EventBroadcaster before submitting (ensures no events are missed)
  2. Accepts either { text: "..." } (user input) or { approvals: [...] } (tool approvals)
  3. Returns an SSE stream (text/event-stream)
  4. Sends a keepalive comment (": keepalive\n\n") every 15 seconds
  5. Terminates the stream on turn_complete, approval_required, error, interrupted, or shutdown
  6. Sets the X-Accel-Buffering: no header so nginx does not buffer the stream
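The streaming behavior above can be sketched as a small async generator. This is a hypothetical sketch, not the actual route code: it assumes events arrive on an asyncio.Queue as dicts with a "type" field, and shows the keepalive-on-timeout and terminate-on-terminal-event logic.

```python
import asyncio
import json

KEEPALIVE_INTERVAL = 15.0  # seconds, per the endpoint description

TERMINAL_EVENTS = {"turn_complete", "approval_required", "error", "interrupted", "shutdown"}

async def sse_stream(event_queue: asyncio.Queue, keepalive: float = KEEPALIVE_INTERVAL):
    """Yield SSE frames from a queue of event dicts (sketch, names assumed)."""
    while True:
        try:
            event = await asyncio.wait_for(event_queue.get(), timeout=keepalive)
        except asyncio.TimeoutError:
            # SSE comment line: ignored by clients, keeps proxies from closing the stream
            yield ": keepalive\n\n"
            continue
        yield f"data: {json.dumps(event)}\n\n"
        if event.get("type") in TERMINAL_EVENTS:
            return  # stream ends on a terminal event
```

In FastAPI, a generator like this would presumably be wrapped in a StreamingResponse with media_type="text/event-stream".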

Session Manager (backend/session_manager.py)

Capacity limits: 200 total sessions, 10 per user.

EventBroadcaster (line 41): Fan-out pattern. One source queue (from agent core) to N subscriber queues (one per SSE connection). Each subscriber gets its own asyncio.Queue.
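The fan-out pattern can be sketched in a few lines. This is an assumed shape, not the real session_manager.py code: one publisher, N independent subscriber queues.

```python
import asyncio

class EventBroadcaster:
    """Fan-out sketch: each SSE connection drains its own asyncio.Queue."""

    def __init__(self) -> None:
        self._subscribers: set[asyncio.Queue] = set()

    def subscribe(self) -> asyncio.Queue:
        q: asyncio.Queue = asyncio.Queue()
        self._subscribers.add(q)
        return q

    def unsubscribe(self, q: asyncio.Queue) -> None:
        self._subscribers.discard(q)

    def publish(self, event: dict) -> None:
        # Copy the event into every subscriber queue; slow consumers
        # buffer independently and cannot block each other.
        for q in self._subscribers:
            q.put_nowait(event)
```

Because each subscriber has its own queue, a reconnecting client (GET /api/events/{id}) simply subscribes and starts receiving events from that point on.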

_run_session() (line 221): Per-session asyncio task that:

  1. Creates the EventBroadcaster
  2. Loops reading from submission_queue with 1s timeout
  3. Calls process_submission() from agent core
  4. Sets is_processing flag around each submission
  5. Cleans up sandbox on exit
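The loop above might look roughly like this. The names (process_submission, is_processing, the stop event) are assumptions based on the description, not the actual code:

```python
import asyncio

async def run_session(session, submission_queue: asyncio.Queue, stop: asyncio.Event):
    """Per-session task sketch: drain submissions until asked to stop."""
    while not stop.is_set():
        try:
            # 1s timeout lets the loop periodically notice shutdown requests
            submission = await asyncio.wait_for(submission_queue.get(), timeout=1.0)
        except asyncio.TimeoutError:
            continue
        session.is_processing = True
        try:
            await session.process_submission(submission)
        finally:
            session.is_processing = False  # always cleared, even on error
```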

Thread safety: Session creation uses asyncio.Lock for capacity checking. Session/ToolRouter constructors run via asyncio.to_thread() since they may block.
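A sketch of the lock-guarded capacity check combined with asyncio.to_thread(). The limits come from the text; the function and dict shapes are assumptions:

```python
import asyncio

MAX_SESSIONS = 200
MAX_PER_USER = 10

sessions: dict[str, dict] = {}
_create_lock = asyncio.Lock()

def _blocking_session_ctor(user_id: str) -> dict:
    # Stand-in for the real Session/ToolRouter constructors, which may block
    return {"user": user_id}

async def create_session(user_id: str, session_id: str) -> dict:
    """Capacity check and construction under one lock (sketch)."""
    async with _create_lock:
        if len(sessions) >= MAX_SESSIONS:
            raise RuntimeError("capacity reached")
        per_user = sum(1 for s in sessions.values() if s["user"] == user_id)
        if per_user >= MAX_PER_USER:
            raise RuntimeError("per-user capacity reached")
        # to_thread keeps the event loop responsive during blocking construction;
        # holding the lock here keeps the capacity check atomic (the real code
        # may release it earlier).
        sessions[session_id] = await asyncio.to_thread(_blocking_session_ctor, user_id)
    return sessions[session_id]
```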

Authentication (backend/routes/auth.py, backend/dependencies.py)

OAuth 2.0 Authorization Code Flow:

Browser                  Backend                  HuggingFace
   |                        |                         |
   | GET /auth/login        |                         |
   |----------------------->|                         |
   |                        | Generate CSRF state     |
   |  302 Redirect          |                         |
   |<-----------------------|                         |
   |                        |                         |
   | GET /authorize         |                         |
   |------------------------------------------------->|
   |                        |                         |
   | 302 Callback           |                         |
   |<-------------------------------------------------|
   |                        |                         |
   | GET /auth/callback?code=...                      |
   |----------------------->|                         |
   |                        | POST /token (exchange)  |
   |                        |------------------------>|
   |                        | { access_token }        |
   |                        |<------------------------|
   |                        | GET /userinfo           |
   |                        |------------------------>|
   |                        | { user data }           |
   |                        |<------------------------|
   |  Set-Cookie: hf_access_token (HttpOnly, 7d)      |
   |<-----------------------|                         |

Scopes: openid profile read-repos write-repos contribute-repos manage-repos inference-api jobs write-discussions

Dev mode: When OAUTH_CLIENT_ID is not set, auth is bypassed entirely. get_current_user() returns DEV_USER with user_id: "dev".

Token caching: Validated tokens cached for 5 minutes in-memory (dependencies.py:21).
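A minimal sketch of such a TTL cache, assuming a validate callable that performs the actual HuggingFace lookup (the real dependencies.py internals are not shown here):

```python
import time

TOKEN_TTL = 300.0  # 5 minutes, per dependencies.py

_token_cache: dict[str, tuple[float, dict]] = {}

def cached_user(token: str, validate, now=time.monotonic) -> dict:
    """Return the cached user for a token, revalidating after TOKEN_TTL."""
    hit = _token_cache.get(token)
    if hit is not None and now() - hit[0] < TOKEN_TTL:
        return hit[1]  # still fresh: no network round-trip
    user = validate(token)
    _token_cache[token] = (now(), user)
    return user
```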


Frontend (React + TypeScript)

Application Structure

frontend/src/
  App.tsx                    # Root: auth check + layout
  main.tsx                   # Entry: theme + providers
  theme.ts                   # MUI dark/light themes

  hooks/
    useAgentChat.ts          # Per-session chat orchestration (743 lines)
    useAuth.ts               # Authentication state
    useOrgMembership.ts      # Org membership polling

  lib/
    sse-chat-transport.ts    # Custom SSE -> AI SDK transport (397 lines)
    chat-message-store.ts    # localStorage message persistence
    research-store.ts        # localStorage research state persistence
    convert-llm-messages.ts  # Backend format -> AI SDK format

  store/
    sessionStore.ts          # Session list (Zustand, persisted)
    agentStore.ts            # Per-session state (Zustand)
    layoutStore.ts           # Layout preferences (Zustand, partial persist)

  components/
    Layout/AppLayout.tsx     # Main layout with sidebar + panel
    SessionChat.tsx          # Per-session chat (active vs hidden)
    WelcomeScreen/           # Onboarding checklist
    Chat/
      ChatInput.tsx          # Input + model selector
      MessageList.tsx        # Scrollable message list
      AssistantMessage.tsx   # Message with grouped tool calls
      ToolCallGroup.tsx      # Tool approval + research display (1117 lines)
      ActivityStatusBar.tsx  # Animated status indicator
    CodePanel/
      CodePanel.tsx          # Right panel for scripts/output/plans
    SessionSidebar/
      SessionSidebar.tsx     # Session list with create/delete

SSE Streaming Pipeline (lib/sse-chat-transport.ts)

The custom SSEChatTransport bridges backend SSE events to the Vercel AI SDK's UIMessageChunk streaming interface:

POST /api/chat/{id}
        |
        v
  response.body (ReadableStream<Uint8Array>)
        |
        v
  TextDecoderStream
        |
        v
  createSSEParserStream()     -- parses "data: {...}\n\n" into AgentEvent objects
        |
        v
  createEventToChunkStream()  -- maps AgentEvent -> UIMessageChunk
        |
        v
  Vercel AI SDK (useChat)     -- renders into React state
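The createSSEParserStream() step boils down to splitting complete "data: ...\n\n" frames out of a text buffer. Here is that logic sketched in Python for illustration; the real implementation is a TypeScript TransformStream:

```python
import json

def parse_sse(buffer: str):
    """Split complete SSE frames out of a buffer.

    Returns (events, remaining_buffer); the remainder holds any
    partial frame still waiting for more bytes.
    """
    events = []
    while "\n\n" in buffer:
        frame, buffer = buffer.split("\n\n", 1)
        for line in frame.splitlines():
            if line.startswith("data: "):
                events.append(json.loads(line[len("data: "):]))
            # lines starting with ":" are keepalive comments -- ignored
    return events, buffer
```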

Event mapping (sse-chat-transport.ts:78-269):

Backend Event         AI SDK Chunk(s)                             Side Effect
ready                 (none)                                      onReady() callback
processing            start + start-step                          onProcessing()
assistant_chunk       text-start (first) + text-delta             Updates streaming state
assistant_stream_end  text-end                                    Marks text complete
tool_call             tool-input-start + tool-input-available     onToolCallPanel()
tool_output           tool-output-available or tool-output-error  onToolOutputPanel()
approval_required     tool-input-start + tool-approval-request    onApprovalRequired()
tool_state_change     (stores job URL/status)                     State update
turn_complete         finish-step + finish(stop)                  Clears processing
error                 finish-step + finish(error)                 Shows error
interrupted           finish-step + finish(stop)                  Marks cancelled

Per-Session Chat Hook (hooks/useAgentChat.ts)

This 743-line hook is the core frontend orchestrator. Key responsibilities:

Side-channel callbacks (lines 44-299): A useMemo block that creates callbacks updating per-session state via agentStore.updateSession().

Backend hydration (lines 360-425): On mount, fetches the full message history from /api/session/{id}/messages, converts it to UIMessages via llmMessagesToUIMessages(), and restores pending approval state.

Wake-from-sleep reconnection (lines 435-608): On visibilitychange, re-hydrates messages, subscribes to GET /api/events/{id} for live SSE, and polls messages every 3 seconds for sync.

Key actions:

State Management (3 Zustand Stores)

sessionStore (persisted to localStorage)

agentStore (not persisted, except specific fields)

layoutStore (partially persisted)

Research Sub-Agent Visualization

The frontend tracks parallel research agents in real time:

+--------------------------------------------+
| research "Finding fine-tuning approach"    |
|   [running . 5 tools . 12.4k tokens . 18s] |
|                                            |
|   > Exploring HF docs: trl                 |
|   > Reading paper: 2401.12345              |
|   > Finding examples: sft training         |
|   > Inspecting dataset: mlabonne/...  [*]  |
+--------------------------------------------+

Data flow:

  1. Backend research tool sends tool_log events with agent_id and label
  2. useAgentChat.onToolLog() parses these into ResearchAgentState entries in agentStore
  3. ToolCallGroup.tsx renders per-agent stats chips and rolling step displays
  4. useSecondTick() hook forces re-render every second for live elapsed time
  5. State persisted to localStorage via research-store.ts for page refresh survival
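Step 2 above amounts to folding tool_log events into per-agent state. A Python sketch of that fold (the real code is TypeScript, and these field names are assumptions based on the text):

```python
def on_tool_log(state: dict, event: dict) -> dict:
    """Fold one tool_log event into per-agent research state."""
    agent = state.setdefault(event["agent_id"], {"steps": [], "tool_count": 0})
    agent["steps"].append(event["label"])   # rolling step display source
    agent["tool_count"] += 1                # feeds the stats chip
    return state
```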

Tool Approval UI (components/Chat/ToolCallGroup.tsx)

At 1,117 lines, this is the most complex UI component.

Code Panel (components/CodePanel/CodePanel.tsx)

A right-side panel for scripts, output, and plans.

Message Rendering Pipeline

Backend LLM messages (litellm format)
        |
        v
  llmMessagesToUIMessages()    -- convert-llm-messages.ts
  (consecutive assistant msgs merged, tool results paired)
        |
        v
  UIMessage[] (Vercel AI SDK format)
        |
        v
  MessageList.tsx
  (auto-scroll, MutationObserver for streaming)
        |
        v
  UserMessage / AssistantMessage
        |
        v
  groupParts()                 -- groups consecutive tools
        |
        v
  MarkdownContent / ToolCallGroup
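The groupParts() step can be illustrated with a small sketch. The part shapes here are assumptions; the real code operates on AI SDK message parts in TypeScript:

```python
def group_parts(parts: list[dict]) -> list[dict]:
    """Collapse consecutive tool parts into one group so they render
    as a single ToolCallGroup; other parts pass through unchanged."""
    groups: list[dict] = []
    for part in parts:
        is_tool = part["type"].startswith("tool-")
        if is_tool and groups and groups[-1].get("type") == "tool-group":
            groups[-1]["parts"].append(part)      # extend the current run
        elif is_tool:
            groups.append({"type": "tool-group", "parts": [part]})
        else:
            groups.append(part)                   # text breaks the run
    return groups
```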

Wake-from-Sleep Reconnection

When the browser tab becomes visible after sleeping:

  1. Fetch /api/session/{id}/messages -- get full backend state
  2. Convert to UIMessages and reconcile with local state
  3. If is_processing: subscribe to GET /api/events/{id} for live SSE
  4. Start polling messages every 3 seconds for sync
  5. On turn_complete or similar terminal event: stop polling, close SSE

This handles the case where the agent was working while the tab was asleep.
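Steps 3-5 amount to racing a live SSE read against a 3-second poll timer. A Python sketch under assumed interfaces (fetch_messages stands in for the HTTP call, events for the async iterator of SSE events):

```python
import asyncio

TERMINAL = {"turn_complete", "error", "interrupted", "shutdown"}

async def resync(fetch_messages, events, poll_interval: float = 3.0):
    """Poll the message history whenever poll_interval passes without an
    SSE event; stop (with one final sync) on a terminal event."""
    it = events.__aiter__()
    pending = asyncio.ensure_future(it.__anext__())
    try:
        while True:
            done, _ = await asyncio.wait({pending}, timeout=poll_interval)
            if not done:
                await fetch_messages()          # periodic sync poll
                continue
            event = pending.result()
            if event.get("type") in TERMINAL:
                return await fetch_messages()   # final sync; caller closes SSE
            pending = asyncio.ensure_future(it.__anext__())
    finally:
        pending.cancel()
```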


CLI Interface (agent/main.py)

Startup Sequence

1. Particle logo animation (braille characters converge to form text)
2. Screen clear
3. CRT boot sequence (typewriter + glitch + scanlines):
   - "User: {hf_username}"
   - "Model: {model_name}"
   - "Tools: loading..."
4. Tool count overwrite (ANSI cursor-up, types actual count)
5. Ready for input

Terminal Display System (agent/utils/terminal_display.py)

Three rendering layers:

  1. Rich layer: Themed console (_THEME with warm gold accents), markdown rendering, panels
  2. ANSI escape layer: Direct cursor manipulation for SubAgentDisplayManager live regions and init-done animation
  3. Typewriter layer: Async character-by-character rendering with variable timing (2ms newlines, 4ms chars, occasional 15ms pauses)
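The typewriter timing above can be sketched as follows. The delays come from the text; the pause probability and the write-callback interface are assumptions:

```python
import asyncio
import random

async def typewriter(text: str, write) -> None:
    """Emit text character by character with variable timing."""
    for ch in text:
        write(ch)
        if ch == "\n":
            await asyncio.sleep(0.002)          # 2ms after newlines
        elif random.random() < 0.05:
            await asyncio.sleep(0.015)          # occasional 15ms pause
        else:
            await asyncio.sleep(0.004)          # 4ms per character
```

Usage might look like asyncio.run(typewriter(text, lambda c: print(c, end="", flush=True))).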

SubAgentDisplayManager (lines 170-314): Manages multiple concurrent sub-agent displays using terminal escape codes. Shows at most 4 tool-call lines per agent, switching to a compact mode when multiple agents are active. Redraws every second to update elapsed timers.

Slash Commands

Command          Action
/help            Show available commands
/undo            Remove last turn from conversation
/compact         Trigger context compaction
/model <id>      Switch LLM model (with preflight validation)
/yolo            Toggle auto-approval mode
/effort <level>  Set reasoning effort (low/medium/high)
/status          Show session info (model, messages, context)
/quit            Exit