# 06 - Backend API & Frontend UI

## Backend (FastAPI)

### Application Setup (`backend/main.py`)

- FastAPI app with CORS for `localhost:5173` and `localhost:3000` (Vite dev servers)
- Two routers: `/api` (agent) and `/auth` (OAuth)
- In production, serves the built frontend from `../static/` as an SPA
- Port 7860 (HF Spaces default)
### Complete API Surface (`backend/routes/agent.py`)

| Endpoint | Method | Auth | Purpose |
|---|---|---|---|
| `/api/health` | GET | No | Health check (active sessions, max sessions) |
| `/api/health/llm` | GET | No | LLM provider reachability (1-token probe) |
| `/api/config/model` | GET | No | Current model + available models list |
| `/api/config/model` | POST | Yes | Set global default LLM model |
| `/api/title` | POST | Yes | Generate 6-word session title from first message |
| `/api/session` | POST | Yes | Create new agent session |
| `/api/session/{id}` | GET | Yes | Get session info (processing state, message count) |
| `/api/session/{id}/model` | POST | Yes | Set per-session model (tab-scoped) |
| `/api/sessions` | GET | Yes | List user's sessions |
| `/api/session/{id}` | DELETE | Yes | Delete session |
| `/api/chat/{id}` | POST | Yes | Primary SSE endpoint: submit + stream |
| `/api/events/{id}` | GET | Yes | Subscribe to events (reconnection) |
| `/api/interrupt/{id}` | POST | Yes | Interrupt agent loop |
| `/api/session/{id}/messages` | GET | Yes | Full message history |
| `/api/undo/{id}` | POST | Yes | Undo last turn |
| `/api/truncate/{id}` | POST | Yes | Truncate to before Nth user message |
| `/api/compact/{id}` | POST | Yes | Trigger context compaction |
| `/api/shutdown/{id}` | POST | Yes | Shutdown session |
Available models (hardcoded in `agent.py:39-62`):

- Claude Opus 4.6 (Anthropic) -- recommended
- MiniMax M2.7 (HuggingFace) -- recommended
- Kimi K2.6 (HuggingFace)
- GLM 5.1 (HuggingFace)
### Primary SSE Endpoint: `POST /api/chat/{id}` (line 319)

This is the main interaction endpoint. It:

- Subscribes to the session's EventBroadcaster before submitting (ensures no events are missed)
- Accepts either `{ text: "..." }` (user input) or `{ approvals: [...] }` (tool approvals)
- Returns an SSE stream (`text/event-stream`)
- Sends a keepalive every 15 seconds (the SSE comment `: keepalive\n\n`)
- Terminates the stream on: `turn_complete`, `approval_required`, `error`, `interrupted`, `shutdown`
- Sets the `X-Accel-Buffering: no` header for nginx proxy compatibility
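The streaming behavior described above can be sketched as an async generator. This is an illustrative sketch, not the project's actual code; the names `sse_stream` and `TERMINAL_EVENTS` are assumptions:

```python
import asyncio
import json

# Terminal event types after which the server closes the stream (per the list above)
TERMINAL_EVENTS = {"turn_complete", "approval_required", "error", "interrupted", "shutdown"}


async def sse_stream(queue: asyncio.Queue, keepalive: float = 15.0):
    """Yield SSE frames from an event queue, interleaving keepalive comments.

    When no event arrives within the keepalive window, emit an SSE comment
    (which clients ignore) so intermediaries don't time out the connection.
    """
    while True:
        try:
            event = await asyncio.wait_for(queue.get(), timeout=keepalive)
        except asyncio.TimeoutError:
            yield ": keepalive\n\n"  # SSE comment line, not a data event
            continue
        yield f"data: {json.dumps(event)}\n\n"
        if event.get("type") in TERMINAL_EVENTS:
            return  # stream ends on terminal events
```

In FastAPI such a generator would typically be wrapped in a `StreamingResponse` with `media_type="text/event-stream"`.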
### Session Manager (`backend/session_manager.py`)

Capacity limits: 200 total sessions, 10 per user.
`EventBroadcaster` (line 41): fan-out pattern. One source queue (from the agent core) feeds N subscriber queues (one per SSE connection). Each subscriber gets its own `asyncio.Queue`.
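A minimal sketch of this fan-out pattern (illustrative only; the real class lives in `backend/session_manager.py` and will differ in detail):

```python
import asyncio


class EventBroadcaster:
    """Fan-out: one publisher feeds N per-subscriber asyncio.Queues."""

    def __init__(self) -> None:
        self._subscribers: list[asyncio.Queue] = []

    def subscribe(self) -> asyncio.Queue:
        # Each SSE connection gets its own queue, so a slow consumer
        # buffers independently without blocking the others.
        q: asyncio.Queue = asyncio.Queue()
        self._subscribers.append(q)
        return q

    def unsubscribe(self, q: asyncio.Queue) -> None:
        self._subscribers.remove(q)

    def publish(self, event: dict) -> None:
        # Deliver the event to every current subscriber.
        for q in self._subscribers:
            q.put_nowait(event)
```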
`_run_session()` (line 221): per-session asyncio task that:

- Creates the EventBroadcaster
- Loops reading from `submission_queue` with a 1s timeout
- Calls `process_submission()` from the agent core
- Sets the `is_processing` flag around each submission
- Cleans up the sandbox on exit
Thread safety: session creation uses an `asyncio.Lock` for capacity checking. `Session`/`ToolRouter` constructors run via `asyncio.to_thread()` since they may block.
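The lock-guarded capacity check plus off-loop construction can be sketched as follows (a simplified sketch under assumed names; the real manager tracks richer session objects):

```python
import asyncio

MAX_SESSIONS = 200   # total capacity, per the limits above
MAX_PER_USER = 10


class SessionManager:
    """Sketch: capacity checks under a lock, blocking constructors off-loop."""

    def __init__(self) -> None:
        self._sessions: dict[str, str] = {}  # session_id -> user_id
        self._lock = asyncio.Lock()

    async def create_session(self, session_id: str, user_id: str, build_session):
        # The lock makes check-then-insert atomic across concurrent requests.
        async with self._lock:
            if len(self._sessions) >= MAX_SESSIONS:
                raise RuntimeError("server at capacity")
            owned = sum(1 for u in self._sessions.values() if u == user_id)
            if owned >= MAX_PER_USER:
                raise RuntimeError("per-user session limit reached")
            self._sessions[session_id] = user_id
        # The constructor may block (network, filesystem), so keep it
        # off the event loop with asyncio.to_thread().
        return await asyncio.to_thread(build_session)
```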
### Authentication (`backend/routes/auth.py`, `backend/dependencies.py`)

OAuth 2.0 Authorization Code Flow:
```
Browser                  Backend                   HuggingFace
   |                        |                          |
   | GET /auth/login        |                          |
   |----------------------->|                          |
   |                        | Generate CSRF state      |
   | 302 Redirect           |                          |
   |<-----------------------|                          |
   |                        |                          |
   | GET /authorize         |                          |
   |------------------------------------------------->|
   |                        |                          |
   | 302 Callback           |                          |
   |<-------------------------------------------------|
   |                        |                          |
   | GET /auth/callback?code=...                      |
   |----------------------->|                          |
   |                        | POST /token (exchange)   |
   |                        |------------------------->|
   |                        | { access_token }         |
   |                        |<-------------------------|
   |                        | GET /userinfo            |
   |                        |------------------------->|
   |                        | { user data }            |
   |                        |<-------------------------|
   | Set-Cookie: hf_access_token (HttpOnly, 7d)       |
   |<-----------------------|                          |
```
Scopes: `openid profile read-repos write-repos contribute-repos manage-repos inference-api jobs write-discussions`

Dev mode: when `OAUTH_CLIENT_ID` is not set, auth is bypassed entirely; `get_current_user()` returns `DEV_USER` with `user_id: "dev"`.

Token caching: validated tokens are cached in memory for 5 minutes (`dependencies.py:21`).
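A minimal sketch of such a 5-minute in-memory token cache (illustrative; the names `TokenCache`, `get`, and `put` are assumptions, not the actual `dependencies.py` API):

```python
import time

TOKEN_TTL_SECONDS = 300  # 5 minutes, per the note above


class TokenCache:
    """Tiny in-memory cache mapping token -> (user info, expiry)."""

    def __init__(self, ttl: float = TOKEN_TTL_SECONDS) -> None:
        self._ttl = ttl
        self._entries: dict[str, tuple[dict, float]] = {}

    def get(self, token: str):
        entry = self._entries.get(token)
        if entry is None:
            return None
        user, expires_at = entry
        if time.monotonic() >= expires_at:
            # Expired: drop the entry so the caller re-validates upstream.
            del self._entries[token]
            return None
        return user

    def put(self, token: str, user: dict) -> None:
        self._entries[token] = (user, time.monotonic() + self._ttl)
```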
## Frontend (React + TypeScript)

### Application Structure

```
frontend/src/
  App.tsx                    # Root: auth check + layout
  main.tsx                   # Entry: theme + providers
  theme.ts                   # MUI dark/light themes
  hooks/
    useAgentChat.ts          # Per-session chat orchestration (743 lines)
    useAuth.ts               # Authentication state
    useOrgMembership.ts      # Org membership polling
  lib/
    sse-chat-transport.ts    # Custom SSE -> AI SDK transport (397 lines)
    chat-message-store.ts    # localStorage message persistence
    research-store.ts        # localStorage research state persistence
    convert-llm-messages.ts  # Backend format -> AI SDK format
  store/
    sessionStore.ts          # Session list (Zustand, persisted)
    agentStore.ts            # Per-session state (Zustand)
    layoutStore.ts           # Layout preferences (Zustand, partial persist)
  components/
    Layout/AppLayout.tsx     # Main layout with sidebar + panel
    SessionChat.tsx          # Per-session chat (active vs hidden)
    WelcomeScreen/           # Onboarding checklist
    Chat/
      ChatInput.tsx          # Input + model selector
      MessageList.tsx        # Scrollable message list
      AssistantMessage.tsx   # Message with grouped tool calls
      ToolCallGroup.tsx      # Tool approval + research display (1117 lines)
      ActivityStatusBar.tsx  # Animated status indicator
    CodePanel/
      CodePanel.tsx          # Right panel for scripts/output/plans
    SessionSidebar/
      SessionSidebar.tsx     # Session list with create/delete
```
### SSE Streaming Pipeline (`lib/sse-chat-transport.ts`)

The custom `SSEChatTransport` bridges backend SSE events to the Vercel AI SDK's `UIMessageChunk` streaming interface:

```
POST /api/chat/{id}
        |
        v
response.body (ReadableStream<Uint8Array>)
        |
        v
TextDecoderStream
        |
        v
createSSEParserStream()     -- parses "data: {...}\n\n" into AgentEvent objects
        |
        v
createEventToChunkStream()  -- maps AgentEvent -> UIMessageChunk
        |
        v
Vercel AI SDK (useChat)     -- renders into React state
```
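The parsing stage above must handle events split across network chunks. Here is a language-agnostic sketch of that buffering logic in Python (the real implementation is the TypeScript `createSSEParserStream()`; this is illustration only):

```python
import json


def parse_sse_chunks(chunks):
    """Incrementally parse decoded SSE text into event dicts.

    Buffers partial text, splits complete frames on blank lines, extracts
    `data:` payloads, and skips comment lines (the keepalives).
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n\n" in buffer:
            frame, buffer = buffer.split("\n\n", 1)
            for line in frame.splitlines():
                if line.startswith("data:"):
                    yield json.loads(line[len("data:"):].strip())
                # Lines starting with ":" are keepalive comments; ignored.
```

Note that an event arriving split across two chunks is only emitted once its terminating blank line arrives.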
Event mapping (`sse-chat-transport.ts:78-269`):

| Backend Event | AI SDK Chunk(s) | Side Effect |
|---|---|---|
| `ready` | (none) | `onReady()` callback |
| `processing` | `start` + `start-step` | `onProcessing()` |
| `assistant_chunk` | `text-start` (first) + `text-delta` | Updates streaming state |
| `assistant_stream_end` | `text-end` | Marks text complete |
| `tool_call` | `tool-input-start` + `tool-input-available` | `onToolCallPanel()` |
| `tool_output` | `tool-output-available` or `tool-output-error` | `onToolOutputPanel()` |
| `approval_required` | `tool-input-start` + `tool-approval-request` | `onApprovalRequired()` |
| `tool_state_change` | (stores job URL/status) | State update |
| `turn_complete` | `finish-step` + `finish(stop)` | Clears processing |
| `error` | `finish-step` + `finish(error)` | Shows error |
| `interrupted` | `finish-step` + `finish(stop)` | Marks cancelled |
### Per-Session Chat Hook (`hooks/useAgentChat.ts`)

This 743-line hook is the core frontend orchestrator. Key responsibilities:

Side-channel callbacks (lines 44-299): a `useMemo` block creating callbacks that update per-session state via `agentStore.updateSession()`. Handles:

- Research sub-agent state (`onToolLog`): parses `tool_log` events to track per-agent progress, tool counts, token counts, elapsed time
- Approval panel data (`onApprovalRequired`): builds a script preview or JSON display for the CodePanel
- Tool call panel (`onToolCallPanel`): shows the running tool's script/args in the panel
- Tool output panel (`onToolOutputPanel`): updates the panel with results

Backend hydration (lines 360-425): on mount, fetches full message history from `/api/session/{id}/messages`, converts it to UIMessages via `llmMessagesToUIMessages()`, and restores pending approval state.

Wake-from-sleep reconnection (lines 435-608): on `visibilitychange`, re-hydrates messages, subscribes to `GET /api/events/{id}` for live SSE, and polls messages every 3 seconds for sync.

Key actions:

- `handleSendMessage`: submit text -> set processing state -> auto-title from first message
- `undoLastTurn`: REST call to `/api/undo/{id}` + remove last turn from UI
- `approveTools`: store edited scripts -> send approval responses via AI SDK
- `stop`: POST `/api/interrupt/{id}` (keeps SSE open for remaining events)
- `editAndRegenerate`: truncate backend + frontend, re-send edited text
### State Management (3 Zustand Stores)

**sessionStore** (persisted to localStorage)

- State: `sessions: SessionMeta[]`, `activeSessionId: string | null`
- Actions: `createSession`, `deleteSession`, `switchSession`, `updateSessionTitle`, `setNeedsAttention`
- Key: `hf-agent-sessions`

**agentStore** (not persisted, except specific fields)

- State: per-session state map + mirrored flat fields for the active session
- Per-session: `isProcessing`, `activityStatus`, `panelData`, `panelView`, `plan`, `researchAgents`, `researchSteps`, `researchStats`
- Pattern: `updateSession(sessionId, updates)` patches the session entry AND mirrors to flat fields if active
- Persisted fields: `editedScripts`, `jobUrls`, `jobStatuses`, `toolErrors`, `rejectedTools` (to localStorage)

**layoutStore** (partially persisted)

- State: `isLeftSidebarOpen`, `isRightPanelOpen`, `rightPanelWidth`, `themeMode`
- Persisted: only `themeMode`
### Research Sub-Agent Visualization

The frontend tracks parallel research agents in real-time:

```
+-------------------------------------------+
| research "Finding fine-tuning approach"   |
| [running . 5 tools . 12.4k tokens . 18s]  |
|                                           |
| > Exploring HF docs: trl                  |
| > Reading paper: 2401.12345               |
| > Finding examples: sft training          |
| > Inspecting dataset: mlabonne/... [*]    |
+-------------------------------------------+
```

Data flow:

- The backend research tool sends `tool_log` events with `agent_id` and `label`
- `useAgentChat.onToolLog()` parses these into `ResearchAgentState` entries in agentStore
- `ToolCallGroup.tsx` renders per-agent stats chips and rolling step displays
- A `useSecondTick()` hook forces a re-render every second for live elapsed time
- State is persisted to localStorage via `research-store.ts` so it survives page refresh
### Tool Approval UI (`components/Chat/ToolCallGroup.tsx`)

The most complex UI component (1,117 lines). Handles:

- Batch approval: when multiple tools are pending, an "Approve all" / "Reject all" header
- Individual approval: per-tool Approve/Reject with optional feedback
- Script preview: click to view and edit scripts in the CodePanel
- Hardware pricing: shows GPU costs for `sandbox_create` and `hf_jobs`
- Edited script tracking: "(edited)" badge when the user modifies a script before approval
- Auto-follow panel: automatically shows the currently running tool in the CodePanel; the user can "lock" a specific tool
### Code Panel (`components/CodePanel/CodePanel.tsx`)

Right-side panel with:

- Script/Output toggle: switch between input and output views
- Inline editing: overlay textarea on syntax-highlighted code
- Syntax highlighting: Python via `react-syntax-highlighter`, Markdown via `react-markdown`
- Log processing: cleans up progress bars and download lines (`utils/logProcessor.ts`)
- Plan display: bottom section with status icons (completed, in_progress, pending)
- Drag-to-resize: desktop gets an inline panel (min 300px, max 60% viewport); mobile gets a bottom drawer at 75vh
### Message Rendering Pipeline

```
Backend LLM messages (litellm format)
        |
        v
llmMessagesToUIMessages()    -- convert-llm-messages.ts
(consecutive assistant msgs merged, tool results paired)
        |
        v
UIMessage[] (Vercel AI SDK format)
        |
        v
MessageList.tsx
(auto-scroll, MutationObserver for streaming)
        |
        v
UserMessage / AssistantMessage
        |
        v
groupParts()                 -- groups consecutive tools
        |
        v
MarkdownContent / ToolCallGroup
```
### Wake-from-Sleep Reconnection

When the browser tab becomes visible after sleeping:

1. Fetch `/api/session/{id}/messages` -- get full backend state
2. Convert to UIMessages and reconcile with local state
3. If `is_processing`: subscribe to `GET /api/events/{id}` for live SSE
4. Start polling messages every 3 seconds for sync
5. On `turn_complete` or a similar terminal event: stop polling, close SSE
This handles the case where the agent was working while the tab was asleep.
## CLI Interface (`agent/main.py`)

### Startup Sequence
1. Particle logo animation (braille characters converge to form text)
2. Screen clear
3. CRT boot sequence (typewriter + glitch + scanlines):
- "User: {hf_username}"
- "Model: {model_name}"
- "Tools: loading..."
4. Tool count overwrite (ANSI cursor-up, types actual count)
5. Ready for input
### Terminal Display System (`agent/utils/terminal_display.py`)

Three rendering layers:

- Rich layer: themed console (`_THEME` with warm gold accents), markdown rendering, panels
- ANSI escape layer: direct cursor manipulation for `SubAgentDisplayManager` live regions and the init-done animation
- Typewriter layer: async character-by-character rendering with variable timing (2ms newlines, 4ms chars, occasional 15ms pauses)
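The typewriter layer's core idea can be sketched in a few lines (a simplified sketch; the real layer adds occasional 15ms pauses and richer timing):

```python
import asyncio
import sys


async def typewriter(text: str, out=sys.stdout) -> None:
    """Render text character by character with variable delays.

    Newlines use a shorter delay (2ms) than ordinary characters (4ms),
    matching the timing described above.
    """
    for ch in text:
        out.write(ch)
        out.flush()  # force each character to appear immediately
        await asyncio.sleep(0.002 if ch == "\n" else 0.004)
```

Any writable object with `write()`/`flush()` (e.g. an `io.StringIO`) can stand in for stdout, which makes the effect easy to test.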
`SubAgentDisplayManager` (lines 170-314): manages multiple concurrent sub-agent displays using terminal escape codes. Shows at most 4 tool-call lines per agent, with a compact mode when multiple agents are active. Redraws every second to update elapsed timers.
### Slash Commands

| Command | Action |
|---|---|
| `/help` | Show available commands |
| `/undo` | Remove last turn from conversation |
| `/compact` | Trigger context compaction |
| `/model <id>` | Switch LLM model (with preflight validation) |
| `/yolo` | Toggle auto-approval mode |
| `/effort <level>` | Set reasoning effort (low/medium/high) |
| `/status` | Show session info (model, messages, context) |
| `/quit` | Exit |