CodeDocs Vault

3. Architecture

High-Level Architecture

                                   +------------------+
                                   |   Web Browser /   |
                                   |   CLI Terminal    |
                                   +--------+---------+
                                            |
                              HTTP/WebSocket |
                                            v
                    +-----------------------------------------------+
                    |              FastAPI Server                    |
                    |  openhands/server/app.py (V0)                 |
                    |  openhands/app_server/v1_router.py (V1)       |
                    |                                               |
                    |  +------------------+  +------------------+   |
                    |  | REST API Routes  |  | Socket.IO Events |   |
                    |  | (CRUD, settings) |  | (real-time stream)|  |
                    |  +------------------+  +------------------+   |
                    +-------------------+---------------------------+
                                        |
                                        v
                    +-----------------------------------------------+
                    |         Conversation Manager                   |
                    |  openhands/server/conversation_manager/        |
                    |                                               |
                    |  Manages agent sessions, lifecycle,           |
                    |  concurrency (max_concurrent_conversations)   |
                    +-------------------+---------------------------+
                                        |
                          +-------------+-------------+
                          |                           |
                          v                           v
          +-------------------------------+  +------------------+
          |      Agent Controller         |  |    Event Stream  |
          |  controller/agent_controller  |  |  events/stream   |
          |                               |  |                  |
          |  - Main agent loop            |  |  - Event bus     |
          |  - State management           |  |  - Persistence   |
          |  - Delegation                 |  |  - Subscriptions |
          |  - Stuck detection            |  |  - Replay        |
          |  - Security checks            |  +------------------+
          +------+----------+-------------+         ^
                 |          |                       |
        +--------+   +------+-------+              |
        v            v              |              |
+-------------+ +----------+       |    +---------+---------+
|    Agent     | |  Memory  |       |    |     File Store    |
| (CodeAct,   | | memory/  |       |    | storage/          |
|  Browsing,  | |          |       |    | (local, S3, GCS)  |
|  ReadOnly)  | | Condenser|       |    +-------------------+
+------+------+ | Micro-   |       |
       |        | agents   |       |
       v        +----------+       |
+-------------+                    |
|     LLM     |                    |
| llm/llm.py  |                    |
|             |                    |
| - LiteLLM   |                    |
| - Retry     |                    |
| - Metrics   |                    |
| - FnCall    |                    |
|   Converter |                    |
+-------------+                    |
                                   |
          +------------------------+
          |
          v
+-----------------------------------------------+
|               Runtime                          |
|  runtime/base.py (abstract)                    |
|                                                |
|  +------------------+  +------------------+    |
|  | DockerRuntime    |  | K8sRuntime       |    |
|  | (primary)        |  | (enterprise)     |    |
|  +------------------+  +------------------+    |
|  +------------------+  +------------------+    |
|  | LocalRuntime     |  | RemoteRuntime    |    |
|  | (development)    |  | (cloud)          |    |
|  +------------------+  +------------------+    |
|                                                |
|  Runs inside sandbox container:                |
|  +------------------------------------------+ |
|  | Action Execution Server (FastAPI)         | |
|  | runtime/action_execution_server.py        | |
|  |                                           | |
|  | - Bash session (persistent shell)         | |
|  | - File operations (read/write/edit)       | |
|  | - IPython kernel                          | |
|  | - Browser (Playwright + BrowserGym)       | |
|  | - MCP proxy                               | |
|  +------------------------------------------+ |
+------------------------------------------------+

Core Components and Their Responsibilities

1. Agent Controller (openhands/controller/agent_controller.py, 1392 lines)

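The central orchestrator. Its responsibilities, as listed in the diagram above, are the main agent loop, state management, delegation, stuck detection, and security checks.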

Key methods:

Method                  Line   Purpose
_step()                 863    Core execution step -- calls agent, checks guards, publishes action
on_event()              454    Event stream subscriber callback
_handle_action()        515    Routes actions to appropriate handlers
set_agent_state_to()    673    Manages state transitions
start_delegate()        735    Creates child controller for sub-agent
end_delegate()          796    Collects delegate results, resumes parent
_is_stuck()             1129   Delegates to StuckDetector
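
The guard sequence that _step() is described as running (check guards, call the agent, publish the action) might look roughly like the sketch below. It is an illustration only, not the controller's actual code: ToyState, ToyStepController, max_iterations, and the choice to move to "ERROR" when a guard trips are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ToyState:
    """Hypothetical stand-in for the controller's state object."""
    agent_state: str = "RUNNING"
    iteration: int = 0
    max_iterations: int = 3


class ToyStepController:
    """Simplified sketch: check guards, call the agent, publish the action."""

    def __init__(
        self,
        agent_step: Callable[[ToyState], str],
        publish: Callable[[str], None],
        is_stuck: Callable[[ToyState], bool] = lambda state: False,
    ) -> None:
        self.state = ToyState()
        self.agent_step = agent_step
        self.publish = publish
        self.is_stuck = is_stuck
        self.pending_action: Optional[str] = None

    def step(self) -> Optional[str]:
        # Guard 1: only step while the agent is running.
        if self.state.agent_state != "RUNNING":
            return None
        # Guard 2: wait until the previous action has been observed.
        if self.pending_action is not None:
            return None
        # Guard 3: iteration limit (assumed here to end in an error state).
        if self.state.iteration >= self.state.max_iterations:
            self.state.agent_state = "ERROR"
            return None
        # Guard 4: stuck detection (assumed to end in an error state too).
        if self.is_stuck(self.state):
            self.state.agent_state = "ERROR"
            return None
        self.state.iteration += 1
        action = self.agent_step(self.state)
        self.pending_action = action
        self.publish(action)
        return action
```

In this sketch the caller clears pending_action when the matching observation arrives, which is what lets the next step proceed.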

2. Agent (openhands/controller/agent.py, 191 lines)

Abstract base class for all agents. Uses a registry pattern for dynamic agent lookup.

# Registration happens via decorator or class attribute
Agent.register("CodeActAgent", CodeActAgent)
 
# Lookup by name
agent_cls = Agent.get_cls("CodeActAgent")
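
For readers unfamiliar with the registry pattern, here is a minimal self-contained sketch of how a name-to-class registry behind register() and get_cls() could work. The internals (the _registry dict and the error types raised) are assumptions for illustration and are not taken from openhands/controller/agent.py.

```python
from typing import Dict, Type


class Agent:
    """Toy base class holding a name -> class registry (hypothetical)."""

    _registry: Dict[str, Type["Agent"]] = {}

    @classmethod
    def register(cls, name: str, agent_cls: Type["Agent"]) -> None:
        # Refuse silent overwrites so a typo cannot shadow an existing agent.
        if name in cls._registry:
            raise ValueError(f"agent already registered: {name}")
        cls._registry[name] = agent_cls

    @classmethod
    def get_cls(cls, name: str) -> Type["Agent"]:
        if name not in cls._registry:
            raise KeyError(f"no agent registered under: {name}")
        return cls._registry[name]


class CodeActAgent(Agent):
    """Placeholder subclass standing in for a concrete agent."""


Agent.register("CodeActAgent", CodeActAgent)
agent_cls = Agent.get_cls("CodeActAgent")  # look the class up by name
agent = agent_cls()                        # instantiate without importing it
```

The benefit is that callers such as a delegation step only need the agent's name, not an import of the concrete class.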

Concrete implementations in openhands/agenthub/:

Agent                 File                                        Purpose
CodeActAgent          agenthub/codeact_agent/codeact_agent.py     Primary agent -- bash, file editing, browsing, Python
BrowsingAgent         agenthub/browsing_agent/browsing_agent.py   Web browsing specialist
ReadOnlyAgent         agenthub/readonly_agent/readonly_agent.py   Read-only file operations
VisualBrowsingAgent   agenthub/visualbrowsing_agent/              Browser with visual understanding
DummyAgent            agenthub/dummy_agent/agent.py               Testing/demo agent

3. Event System (openhands/events/)

Event sourcing is the backbone of the architecture. Every interaction is recorded as an immutable event.

Event (base dataclass)
├── Action (agent or user intent)
│   ├── MessageAction         -- chat message
│   ├── CmdRunAction          -- shell command
│   ├── IPythonRunCellAction  -- Python code
│   ├── FileReadAction        -- read file
│   ├── FileWriteAction       -- write file
│   ├── FileEditAction        -- edit file (str_replace)
│   ├── BrowseInteractiveAction -- browser interaction
│   ├── AgentDelegateAction   -- delegate to sub-agent
│   ├── AgentFinishAction     -- task completed
│   ├── AgentThinkAction      -- reasoning (logged, not executed)
│   ├── RecallAction          -- retrieve microagent knowledge
│   ├── CondensationAction    -- compress conversation history
│   ├── MCPAction             -- MCP tool call
│   └── ChangeAgentStateAction -- state transition
│
└── Observation (result of action)
    ├── CmdOutputObservation       -- command output
    ├── IPythonRunCellObservation   -- Python output
    ├── FileReadObservation         -- file contents
    ├── FileEditObservation         -- edit result
    ├── BrowserOutputObservation    -- browser state
    ├── ErrorObservation            -- error details
    ├── AgentDelegateObservation    -- sub-agent result
    ├── RecallObservation           -- microagent knowledge
    ├── AgentCondensationObservation -- condensation result
    └── LoopDetectionObservation    -- stuck detection alert
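
A small part of the hierarchy above can be sketched with frozen dataclasses, which gives a feel for how an Observation links back to the Action that caused it. The field names (id, source, cause, command, content, exit_code) and the use of frozen=True are assumptions for this example; the real classes under openhands/events/ may be laid out differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """Base event; frozen so an instance cannot be changed after creation."""
    id: int
    source: str  # for example "user", "agent", or "environment"


@dataclass(frozen=True)
class Action(Event):
    """Something an agent or user intends to do."""


@dataclass(frozen=True)
class Observation(Event):
    """The result of an action."""
    cause: Optional[int] = None  # id of the action that produced this result


@dataclass(frozen=True)
class CmdRunAction(Action):
    command: str = ""


@dataclass(frozen=True)
class CmdOutputObservation(Observation):
    content: str = ""
    exit_code: int = 0


action = CmdRunAction(id=1, source="agent", command="ls")
obs = CmdOutputObservation(
    id=2, source="environment", cause=action.id, content="README.md"
)
```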

EventStream (events/stream.py, 291 lines) is the event bus. It handles persistence, subscriptions, and replay.

Subscriber types (EventStreamSubscriber enum):

AGENT_CONTROLLER, RESOLVER, SERVER, RUNTIME, MEMORY, MAIN, TEST
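
The subscribe, publish, and replay roles of the stream can be illustrated with a toy event bus. This is a synchronous sketch under stated assumptions: ToyEventStream and its method signatures are hypothetical, only three of the subscriber names are reproduced, and the enum values are arbitrary. The pipeline described later in this section queues events for asynchronous dispatch, which this sketch does not attempt.

```python
from collections import defaultdict
from enum import Enum, auto
from typing import Callable, Dict, List

Callback = Callable[[dict], None]


class EventStreamSubscriber(Enum):
    # Names come from the list above; the values are arbitrary here.
    AGENT_CONTROLLER = auto()
    RUNTIME = auto()
    SERVER = auto()


class ToyEventStream:
    """Synchronous sketch of an event bus with replay (hypothetical API)."""

    def __init__(self) -> None:
        self._events: List[dict] = []
        self._callbacks: Dict[EventStreamSubscriber, Dict[str, Callback]] = (
            defaultdict(dict)
        )

    def subscribe(
        self, subscriber: EventStreamSubscriber, callback_id: str, callback: Callback
    ) -> None:
        if callback_id in self._callbacks[subscriber]:
            raise ValueError(f"callback already registered: {callback_id}")
        self._callbacks[subscriber][callback_id] = callback

    def add_event(self, event: dict) -> dict:
        # Assign the next id, store the event, then notify every subscriber.
        stored = {**event, "id": len(self._events)}
        self._events.append(stored)
        for callbacks in list(self._callbacks.values()):
            for callback in list(callbacks.values()):
                callback(stored)
        return stored

    def replay(self, start_id: int = 0) -> List[dict]:
        # Late joiners can catch up from any position in the stream.
        return self._events[start_id:]
```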

4. Runtime System (openhands/runtime/)

Provides sandboxed execution environments. The key abstraction:

from abc import ABC

# Signatures only; the action and observation types are defined under openhands/events/.
class Runtime(ABC):
    async def connect(self) -> None: ...
    def run(self, action: CmdRunAction) -> Observation: ...
    def run_ipython(self, action: IPythonRunCellAction) -> Observation: ...
    def read(self, action: FileReadAction) -> Observation: ...
    def write(self, action: FileWriteAction) -> Observation: ...
    def edit(self, action: FileEditAction) -> Observation: ...
    def browse(self, action: BrowseURLAction) -> Observation: ...
    def browse_interactive(self, action: BrowseInteractiveAction) -> Observation: ...
    async def call_tool_mcp(self, action: MCPAction) -> Observation: ...
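
To show how an interface of this shape maps action types to handler methods, here is a toy runtime that supports only file reads and writes against an in-memory dict. ToyRuntime, InMemoryRuntime, run_action, and the simplified action and observation classes are all hypothetical stand-ins, not the real OpenHands types.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict


@dataclass
class FileReadAction:
    path: str


@dataclass
class FileWriteAction:
    path: str
    content: str


@dataclass
class Observation:
    content: str
    error: bool = False


class ToyRuntime(ABC):
    """Abstract sketch: one handler method per supported action type."""

    @abstractmethod
    def read(self, action: FileReadAction) -> Observation: ...

    @abstractmethod
    def write(self, action: FileWriteAction) -> Observation: ...

    def run_action(self, action: object) -> Observation:
        # Dispatch on the concrete action type; unknown types become errors.
        handlers = {FileReadAction: self.read, FileWriteAction: self.write}
        handler = handlers.get(type(action))
        if handler is None:
            return Observation(f"unsupported action: {type(action).__name__}", True)
        return handler(action)


class InMemoryRuntime(ToyRuntime):
    """Concrete sketch that keeps 'files' in a dict instead of a sandbox."""

    def __init__(self) -> None:
        self.files: Dict[str, str] = {}

    def read(self, action: FileReadAction) -> Observation:
        if action.path not in self.files:
            return Observation(f"file not found: {action.path}", True)
        return Observation(self.files[action.path])

    def write(self, action: FileWriteAction) -> Observation:
        self.files[action.path] = action.content
        return Observation(f"wrote {len(action.content)} characters to {action.path}")
```

Failures are returned as observations rather than raised, mirroring the idea that every result flows back to the agent as an event.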

DockerRuntime is the primary implementation (runtime/impl/docker/docker_runtime.py).

5. LLM Layer (openhands/llm/)

Wraps LLM providers (via LiteLLM) behind a unified interface and adds retry with exponential backoff, metrics tracking, and conversion between native and text-based function calls.

LLM (llm.py, 874 lines)
├── RetryMixin (retry_mixin.py, 108 lines)
│   └── tenacity-based exponential backoff
├── DebugMixin (debug_mixin.py)
│   └── Prompt/response logging
├── Metrics (metrics.py, 284 lines)
│   └── Cost, tokens, latency tracking
├── FnCallConverter (fn_call_converter.py, 979 lines)
│   └── Native ↔ text-based function call conversion
├── ModelFeatures (model_features.py, 173 lines)
│   └── Pattern-based feature detection per model
└── StreamingLLM / AsyncLLM
    └── Async and streaming variants
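
The retry behavior listed under RetryMixin can be illustrated with a plain-Python exponential backoff helper. Note the substitution: the tree above says the real mixin is tenacity-based, whereas this sketch uses only the standard library so that it stays self-contained. The function name and every parameter (max_attempts, min_wait, max_wait, multiplier) are assumptions for the example.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    *,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    max_attempts: int = 4,
    min_wait: float = 1.0,
    max_wait: float = 8.0,
    multiplier: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on the given exceptions with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    wait = min_wait
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(min(wait, max_wait))  # capped delay before the next try
            wait *= multiplier
    raise AssertionError("unreachable")
```

Passing sleep as a parameter is a design choice that lets tests record the delays instead of actually waiting.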

6. Memory System (openhands/memory/)

Memory (memory.py, 405 lines)
├── Microagent loader (global + user + repo)
├── RecallAction handler
│   ├── WORKSPACE_CONTEXT recall (first message)
│   └── KNOWLEDGE recall (trigger-based)
└── RecallObservation emitter
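
Trigger-based KNOWLEDGE recall can be pictured as matching a message against each microagent's trigger words. The sketch below assumes case-insensitive substring matching and a hypothetical ToyMicroagent record; the actual matching rules in memory.py are not described in this section and may differ.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ToyMicroagent:
    """Hypothetical record: a named piece of knowledge with trigger words."""
    name: str
    triggers: Tuple[str, ...]
    content: str


def recall_knowledge(
    message: str, microagents: List[ToyMicroagent]
) -> List[ToyMicroagent]:
    """Return every microagent with at least one trigger found in the message."""
    text = message.lower()
    return [
        agent
        for agent in microagents
        if any(trigger.lower() in text for trigger in agent.triggers)
    ]


known = [
    ToyMicroagent("git", ("git", "branch"), "Prefer small, focused commits."),
    ToyMicroagent("docker", ("docker", "container"), "Pin base image versions."),
]
matches = recall_knowledge("Please create a new Git branch", known)
```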

ConversationMemory (conversation_memory.py)
├── Event → Message converter
├── Tool call completion tracking
└── Vision content handling

Condenser (condenser/)
├── LLMSummarizingCondenser  -- LLM-generated summaries
├── StructuredSummaryCondenser -- Structured field extraction
├── ObservationMaskingCondenser -- Masks old observations
├── ConversationWindowCondenser -- Sliding window
├── AmortizedForgettingCondenser -- Gradual forgetting
└── NoOpCondenser -- No compression
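
As an illustration of the sliding-window idea, the function below keeps the first few events (for example the initial task message) plus the most recent ones, and drops the middle. It is a sketch of the general technique rather than the ConversationWindowCondenser implementation; window_condense, max_events, and keep_first are names invented for this example.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def window_condense(
    events: Sequence[T], max_events: int, keep_first: int = 1
) -> List[T]:
    """Keep the first keep_first events plus the newest ones, up to max_events."""
    if keep_first < 0 or max_events <= keep_first:
        raise ValueError("max_events must be greater than keep_first (>= 0)")
    if len(events) <= max_events:
        return list(events)  # nothing to drop yet
    tail_size = max_events - keep_first
    head = list(events[:keep_first])
    tail = list(events[len(events) - tail_size:])
    return head + tail


history = list(range(10))              # stand-in for ten events
condensed = window_condense(history, max_events=4)
```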

7. Server & API (openhands/server/ and openhands/app_server/)

The server runs a dual architecture during the V0→V1 migration:

V0 (Legacy): openhands/server/app.py

V1 (New): openhands/app_server/v1_router.py

Component Interaction Diagram

sequenceDiagram
    participant User as User (Browser/CLI)
    participant Server as FastAPI Server
    participant CM as ConversationManager
    participant AC as AgentController
    participant Agent as CodeActAgent
    participant LLM as LLM (LiteLLM)
    participant ES as EventStream
    participant Mem as Memory
    participant RT as Runtime (Docker)
    participant Sandbox as Sandbox Container
 
    User->>Server: POST /api/conversations (create)
    Server->>CM: attach_to_conversation()
    CM->>RT: create & connect runtime
    RT->>Sandbox: docker create + start
    Sandbox-->>RT: Action Execution Server ready
 
    User->>Server: WebSocket connect (conversation_id)
    Server->>CM: join_conversation()
    CM-->>User: replay existing events
 
    User->>Server: send MessageAction
    Server->>ES: add_event(MessageAction, SOURCE=USER)
    ES->>AC: on_event(MessageAction)
 
    AC->>Mem: RecallAction(WORKSPACE_CONTEXT)
    Mem-->>ES: RecallObservation (repo info, microagents)
    ES->>AC: on_event(RecallObservation)
 
    loop Agent Loop
        AC->>Agent: step(state)
        Agent->>LLM: completion(messages, tools)
        LLM-->>Agent: tool_call (e.g., execute_bash)
        Agent-->>AC: CmdRunAction
 
        AC->>AC: security_check(action)
        AC->>ES: add_event(CmdRunAction, SOURCE=AGENT)
        ES->>RT: on_event(CmdRunAction)
        RT->>Sandbox: HTTP POST /execute
        Sandbox-->>RT: command output
        RT->>ES: add_event(CmdOutputObservation)
        ES->>AC: on_event(CmdOutputObservation)
        ES->>Server: emit('oh_event')
        Server->>User: WebSocket event
    end
 
    Agent-->>AC: AgentFinishAction
    AC->>ES: add_event(AgentFinishAction)
    ES->>Server: emit('oh_event')
    Server->>User: task complete

Data Flow: Action Processing Pipeline

User Message
    │
    v
EventStream.add_event(MessageAction)
    │
    ├── Serialize to JSON
    ├── Redact secrets
    ├── Assign ID + timestamp
    ├── Persist to FileStore
    └── Queue for async dispatch
         │
         v
    AgentController.on_event()
         │
         ├── Add to state.history
         ├── _handle_message_action()
         │   └── Create RecallAction → Memory
         │       └── RecallObservation (workspace context)
         └── should_step() → True
              │
              v
         _step()
              │
              ├── Check: state == RUNNING?
              ├── Check: no pending_action?
              ├── Check: iteration/budget limits?
              ├── Check: not stuck?
              │
              ├── agent.step(state) → Action
              │   │
              │   ├── ConversationMemory.process_events()
              │   │   └── Convert events → LLM messages
              │   │
              │   ├── Condenser.condensed_history()
              │   │   └── Maybe compress old events
              │   │
              │   ├── LLM.completion(messages, tools)
              │   │   ├── Format messages (cache, vision)
              │   │   ├── Mock function calling if needed
              │   │   ├── Call litellm.completion()
              │   │   ├── Track metrics (cost, tokens, latency)
              │   │   └── Convert response → Actions
              │   │
              │   └── Return Action (or queue multiple)
              │
              ├── Security analysis (risk assessment)
              ├── Confirmation mode check
              └── EventStream.add_event(action)
                   │
                   └── Runtime subscriber receives action
                        │
                        v
                   Execute in sandbox
                        │
                        v
                   EventStream.add_event(observation)
                        │
                        └── Back to AgentController.on_event()
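
The first stage of the pipeline above (serialize, redact secrets, assign ID and timestamp, persist) can be sketched as follows. Everything concrete here is an assumption for illustration: the ToyStore class, the events/<id>.json path layout, and the <secret_hidden> placeholder are not taken from the source, and async dispatch is left out.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, List


def redact(value: Any, secrets: List[str]) -> Any:
    """Recursively replace secret substrings inside strings, dicts, and lists."""
    if isinstance(value, str):
        for secret in secrets:
            if secret:  # skip empty strings, which would match everywhere
                value = value.replace(secret, "<secret_hidden>")
        return value
    if isinstance(value, dict):
        return {key: redact(item, secrets) for key, item in value.items()}
    if isinstance(value, list):
        return [redact(item, secrets) for item in value]
    return value


class ToyStore:
    """In-memory stand-in for a file store."""

    def __init__(self) -> None:
        self.files: Dict[str, str] = {}

    def write(self, path: str, contents: str) -> None:
        self.files[path] = contents


def add_event(store: ToyStore, next_id: int, event: dict, secrets: List[str]) -> dict:
    record = redact(event, secrets)          # new dict; the input is untouched
    record["id"] = next_id
    record["timestamp"] = datetime.now(timezone.utc).isoformat()
    store.write(f"events/{next_id}.json", json.dumps(record))
    return record
```

Redacting before persisting means the stored history never contains the secret, so replaying it later cannot leak it.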

Multi-Agent Delegation Flow

AgentController (Parent)
│
├── Agent produces AgentDelegateAction
│   └── {agent: "BrowsingAgent", inputs: {task: "..."}}
│
├── start_delegate()
│   ├── Create new Agent instance (BrowsingAgent)
│   ├── Create child AgentController
│   │   ├── delegate_level = parent + 1
│   │   ├── Shared metrics (cost tracking)
│   │   ├── Shared event_stream
│   │   ├── is_delegate = True (no subscription)
│   │   └── start_id = current stream position
│   └── Send MessageAction to child
│
├── Parent pauses (should_step returns False when delegate active)
│
├── Child AgentController runs:
│   ├── Child agent.step() → actions
│   ├── Actions executed via runtime
│   ├── Observations received
│   └── Eventually: AgentFinishAction
│
└── end_delegate()
    ├── Close child controller
    ├── Extract outputs from child state
    ├── Create AgentDelegateObservation
    ├── Publish to event stream
    └── Parent resumes normal loop
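
The delegation flow above can be condensed into a toy controller that shares its event list with a child and pauses while the child is active. This is a sketch under assumptions: ToyDelegatingController, the dict-shaped events, and the outputs field are hypothetical simplifications of the behavior described, not the real AgentController.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyDelegatingController:
    agent_name: str
    delegate_level: int = 0
    is_delegate: bool = False
    events: List[dict] = field(default_factory=list)      # shared with children
    outputs: Dict[str, str] = field(default_factory=dict)
    delegate: Optional["ToyDelegatingController"] = None

    def should_step(self) -> bool:
        # The parent pauses for as long as a delegate is active.
        return self.delegate is None

    def start_delegate(self, agent_name: str, task: str) -> "ToyDelegatingController":
        child = ToyDelegatingController(
            agent_name=agent_name,
            delegate_level=self.delegate_level + 1,
            is_delegate=True,
            events=self.events,  # same list object: one shared stream
        )
        self.delegate = child
        self.events.append({"type": "message", "to": agent_name, "content": task})
        return child

    def end_delegate(self) -> None:
        child = self.delegate
        if child is None:
            raise RuntimeError("no active delegate")
        self.delegate = None  # parent resumes its normal loop
        self.events.append(
            {"type": "delegate_observation", "outputs": dict(child.outputs)}
        )
```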

State Machine

                    ┌──────────┐
                    │ LOADING  │
                    └────┬─────┘
                         │
                    ┌────v─────┐
              ┌─────┤ RUNNING  ├─────┐
              │     └────┬─────┘     │
              │          │           │
    ┌─────────v───┐ ┌────v─────┐ ┌──v──────────────────┐
    │  PAUSED     │ │ AWAITING │ │ AWAITING             │
    │ (user pause)│ │ USER     │ │ USER_CONFIRMATION    │
    └─────────┬───┘ │ INPUT    │ │ (security check)     │
              │     └────┬─────┘ └──┬──────────┬────────┘
              │          │          │          │
              └──────────┼──────────┘   ┌──────v──────┐
                         │              │  USER_      │
                         │              │  CONFIRMED/ │
                    ┌────v─────┐        │  REJECTED   │
                    │ RUNNING  │◄───────┘             │
                    └──┬───┬───┘                      │
                       │   │                          │
              ┌────────┘   └────────┐                 │
              v                    v                  │
        ┌──────────┐         ┌──────────┐             │
        │ FINISHED │         │  ERROR   │             │
        └──────────┘         └──────────┘             │
                                   ^                  │
                                   │                  │
                             ┌─────┴──────┐           │
                             │RATE_LIMITED│           │
                             └────────────┘           │
                                                      │
                             ┌────────────┐           │
                             │  STOPPED   │◄──────────┘
                             │ (user stop)│  (if rejected)
                             └────────────┘
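
One way to read the state diagram is as a transition table. The sketch below encodes only the edges that the diagram shows, under one interpretation of its arrows; the diagram does not show how RATE_LIMITED is entered, so no edge into it is encoded. The enum string values and the transition() helper are assumptions for illustration.

```python
from enum import Enum
from typing import Dict, Set


class AgentState(Enum):
    LOADING = "loading"
    RUNNING = "running"
    PAUSED = "paused"
    AWAITING_USER_INPUT = "awaiting_user_input"
    AWAITING_USER_CONFIRMATION = "awaiting_user_confirmation"
    USER_CONFIRMED = "user_confirmed"
    USER_REJECTED = "user_rejected"
    FINISHED = "finished"
    ERROR = "error"
    RATE_LIMITED = "rate_limited"
    STOPPED = "stopped"


S = AgentState

# Edges as read from the diagram; terminal states have no outgoing edges.
ALLOWED: Dict[AgentState, Set[AgentState]] = {
    S.LOADING: {S.RUNNING},
    S.RUNNING: {
        S.PAUSED,
        S.AWAITING_USER_INPUT,
        S.AWAITING_USER_CONFIRMATION,
        S.FINISHED,
        S.ERROR,
    },
    S.PAUSED: {S.RUNNING},
    S.AWAITING_USER_INPUT: {S.RUNNING},
    S.AWAITING_USER_CONFIRMATION: {S.USER_CONFIRMED, S.USER_REJECTED},
    S.USER_CONFIRMED: {S.RUNNING},
    S.USER_REJECTED: {S.STOPPED},
    S.RATE_LIMITED: {S.ERROR},
}


def transition(current: AgentState, target: AgentState) -> AgentState:
    """Return target if the edge is allowed, otherwise raise ValueError."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition: {current.name} -> {target.name}")
    return target
```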