Runtime & Docker Sandbox

Every Strix scan boots a dedicated Docker container that runs the real pentest tooling (Kali + Caido + Playwright). The host process talks to it over HTTP with a bearer token. This doc covers the container lifecycle, the FastAPI tool server, the Kali image, networking, isolation, and failure modes.


1. The Host/Container Split

┌─ Host process ────────────────────────────────────────┐
│ Agent orchestration, LLM calls, state management       │
│ + DockerRuntime (container lifecycle)                  │
│ + httpx.AsyncClient (tool-call transport)              │
└───────────────────────────────────────────────────────┘
                    │ HTTPS POST /execute
                    │ Authorization: Bearer <token>
                    ▼
┌─ Docker container (strix-scan-<id>) ──────────────────┐
│ Kali rolling + security tools                          │
│ FastAPI tool_server (port 48081 in container)          │
│ Caido proxy (port 48080 in container)                  │
│ /workspace = target code (tar-uploaded)                │
│ pentester user with passwordless sudo                  │
└───────────────────────────────────────────────────────┘

Security boundary. The container runs with NET_ADMIN + NET_RAW capabilities (strix/runtime/docker_runtime.py:144) so tools like nmap can craft raw packets. The host retains API keys, agent state, and orchestration; a hostile target that compromises the tooling can reach only the sandbox, never the host. Bearer-token auth is enforced on every request to /execute (runtime/tool_server.py:42-57).

Trade-off. All agents in a single scan share the same container. Isolation is per-scan, not per-agent. Strix documents this explicitly in the system prompt (agents/StrixAgent/system_prompt.jinja:233-238).


2. Container Lifecycle

2.1 Creation (strix/runtime/docker_runtime.py:111-173)

flowchart TD
    A[DockerRuntime.create_sandbox] --> B{Container already<br/>tagged with scan_id?}
    B -- yes --> C[start if stopped,<br/>recover token/ports]
    B -- no --> D[_find_available_port × 2<br/>mint 32-byte bearer token]
    D --> E[docker.containers.run<br/>sleep infinity]
    E --> F[Copy target into /workspace<br/>via put_archive tar upload]
    F --> G[_wait_for_tool_server<br/>ping /health up to 30×]
    C --> G
    G --> H[Return sandbox_info<br/>{sandbox_id, port, token}]
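The port-and-token minting step in the flow above can be sketched like this; mint_sandbox_info and the exact dict shape are assumptions for illustration, not the real API:

```python
import secrets
import socket

def find_available_port() -> int:
    """Bind to port 0 and let the OS hand back a currently free TCP port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

def mint_sandbox_info(scan_id: str) -> dict:
    """Hypothetical shape of the sandbox_info returned by create_sandbox."""
    return {
        "sandbox_id": f"strix-scan-{scan_id}",
        "tool_port": find_available_port(),   # host port mapped to 48081 in-container
        "proxy_port": find_available_port(),  # host port mapped to 48080 in-container
        "token": secrets.token_hex(32),       # 32-byte bearer token
    }
```

Note the inherent race: the port is free when probed but only claimed when docker.containers.run binds it, so container start can still fail and need a retry.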

Key bits:

2.2 Target Upload (docker_runtime.py:222-248)

_copy_local_directory_to_container tars the local directory in memory, uploads via put_archive(), then chowns to pentester:pentester. Multi-source support (:266) — multiple --target flags can each be uploaded to distinct subdirs under /workspace.
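The in-memory tar step amounts to the following; build_workspace_tar is a simplified stand-in (it takes a path-to-bytes mapping rather than walking a directory) so the shape of the put_archive() payload is visible:

```python
import io
import tarfile

def build_workspace_tar(files: dict[str, bytes]) -> bytes:
    """Build the in-memory tar stream that put_archive() expects.

    `files` maps relative paths to contents; everything lands under /workspace
    once uploaded (ownership is fixed afterwards with a chown to pentester).
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=f"workspace/{name}")
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

# container.put_archive("/", build_workspace_tar({...}))  # docker SDK upload (sketch)
```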

2.3 Startup Sequence Inside the Container

containers/docker-entrypoint.sh runs on container launch:

  1. Caido starts on port 48080, entrypoint waits for GraphQL readiness (lines 24-46).
  2. Guest-login token fetched from Caido GraphQL (lines 50-74).
  3. Temporary Caido project created + selected (lines 79-94).
  4. System-wide proxy env vars written to /etc/profile.d/proxy.sh, /etc/environment, /etc/wgetrc, shell RCs (lines 115-144) — every subsequent tool inherits http_proxy=http://127.0.0.1:48080.
  5. Caido's self-signed CA certificate imported into the system trust store + Firefox NSS DB (lines 149-152) so HTTPS is transparently decrypted.
  6. Tool server launched via uvicorn on the configured port with Bearer-token auth (lines 154-180).
  7. /health poll before returning (line 169).
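The readiness handshake in step 7 (and _wait_for_tool_server on the host side) is a bounded retry loop. A minimal sketch, with the HTTP probe injected as a callable so the retry logic stands on its own:

```python
import time
from typing import Callable

def wait_until_healthy(probe: Callable[[], bool], attempts: int = 30, delay: float = 1.0) -> bool:
    """Retry a /health probe up to `attempts` times before giving up."""
    for _ in range(attempts):
        try:
            if probe():
                return True
        except OSError:  # connection refused while uvicorn is still starting
            pass
        time.sleep(delay)
    return False

# A real probe might look like (hypothetical):
#   lambda: httpx.get(f"http://127.0.0.1:{port}/health", headers=auth).status_code == 200
```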

2.4 Teardown (docker_runtime.py:334-352)

cleanup() spawns a detached docker rm -f subprocess — non-blocking on exit. Containers can be manually recovered by scan_id on next run.
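The detached, non-blocking teardown can be sketched as below; the docker_bin parameter is an artificial injection point so the sketch is testable without a Docker daemon:

```python
import subprocess

def cleanup(container_name: str, docker_bin: str = "docker") -> None:
    """Fire-and-forget `docker rm -f`; never blocks host shutdown on Docker I/O."""
    subprocess.Popen(
        [docker_bin, "rm", "-f", container_name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # detach from the host's process group
    )
```

Because the subprocess is never waited on, a slow or wedged Docker daemon cannot stall host exit; the cost is that removal failures go unnoticed.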


3. Tool Server (strix/runtime/tool_server.py)

A small FastAPI app that runs inside the container and dispatches LLM tool calls.

3.1 Endpoints

| Endpoint | Purpose |
| --- | --- |
| POST /execute (:86-127) | Core dispatch — receives {agent_id, tool_name, kwargs}, looks up the tool in the registry, calls it, returns {result} or {error} |
| POST /register_agent (:130-135) | Lightweight presence registration — called when a new subagent starts |
| GET /health (:138-147) | Liveness probe — returns {status, sandbox_mode, auth_config, active_agents} |

All endpoints require Authorization: Bearer <TOOL_SERVER_TOKEN> (:42-57). No endpoint accepts unauthenticated requests.
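Stripped of the FastAPI plumbing, the /execute dispatch reduces to a lookup-call-wrap function; this is an illustrative reconstruction, not the code at tool_server.py:86-127:

```python
def execute(registry: dict, agent_id: str, tool_name: str, kwargs: dict) -> dict:
    """Dispatch one tool call; errors come back as strings, never HTTP 500s.

    `agent_id` is unused here; in the real server it keys per-agent state.
    """
    fn = registry.get(tool_name)
    if fn is None:
        return {"error": f"Unknown tool: {tool_name}"}
    try:
        return {"result": fn(**kwargs)}
    except Exception as exc:  # tool failures are data, not transport errors
        return {"error": str(exc)}
```

Returning errors in-band keeps the host-side retry logic simple: any 4xx/5xx means infrastructure trouble, while a well-formed {"error": ...} means the tool itself failed.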

3.2 Execution Model

await asyncio.wait_for(
    asyncio.to_thread(fn, **kwargs),
    timeout=REQUEST_TIMEOUT,
)
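Wrapped into a runnable whole, the pattern looks like this; run_tool is a hypothetical wrapper, but the semantics match the snippet above: asyncio.to_thread work cannot be interrupted, so a timed-out tool keeps running in its worker thread:

```python
import asyncio
import time

REQUEST_TIMEOUT = 120.0  # server-side budget per tool call

async def run_tool(fn, timeout_s: float = REQUEST_TIMEOUT) -> dict:
    """Run a blocking tool in a worker thread with a hard deadline."""
    try:
        result = await asyncio.wait_for(asyncio.to_thread(fn), timeout=timeout_s)
        return {"result": result}
    except asyncio.TimeoutError:
        # Only the awaiter is cancelled; the thread runs on in the background.
        return {"error": f"Tool timed out after {timeout_s}s"}
```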

3.3 Transport

Request / response are both JSON. No streaming. The host side waits up to 150s (120s server timeout + 30s buffer, strix/tools/executor.py:25) with a 10s connect timeout. Exceptions inside tools are caught and returned as strings in the response, never as HTTP 500s (tool_server.py:119-123).


4. The Kali Image

Built from containers/Dockerfile on the kalilinux/kali-rolling:latest base.

4.1 Installed Tools (by category)

Network + reconnaissance: nmap (with CAP_NET_RAW/CAP_NET_ADMIN setcap, Dockerfile:49), ncat, ndiff, dnsutils, whois, naabu, subfinder.

Web scanning + fuzzing: nuclei, httpx, katana, gospider, ffuf, arjun, dirsearch, wafw00f, wapiti.

Proxy: Caido v0.48.0 with a pre-generated 10-year self-signed CA (built during image build).

Browser: Playwright + Chromium (line 201).

Code analysis: tree-sitter parsers (Java, JS, Python, Go, Bash, JSON, YAML, TypeScript), semgrep, bandit, retire (JS deps), eslint, ast-grep (sg).

Secrets: trufflesecurity/trufflehog, gitleaks.

Image/container: trivy.

Custom: JS-Snooper, jsniper, jwt_tool, interactsh-client.

Runtimes: Python 3, uv, Go, Node/npm.

Utilities: tmux, parallel, jq, ripgrep, gdb.

4.2 Packaging

4.3 Size

Not aggressively minimized — expect 10–15GB uncompressed. The README notes "first run automatically pulls the sandbox Docker image". On a fast connection that's a 5–10 minute upfront cost.


5. Networking

5.1 Outbound

Container has full internet access via Docker's default bridge network. No egress policy beyond whatever Docker/host firewall provides.

5.2 Proxy Interception

Every tool that honors http_proxy/https_proxy (curl, wget, httpx, requests, etc.) automatically routes through Caido on 127.0.0.1:48080. Caido decrypts HTTPS via the pre-installed CA and persists the traffic to a project DB, which the proxy tool queries via GraphQL.

Playwright / Chromium honor the same env vars plus the NSS cert import, so browser traffic is also intercepted.
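The env-var mechanism is nothing Strix-specific; any HTTP stack that follows the convention picks the proxy up automatically. A minimal stdlib demonstration of what step 4 of the entrypoint arranges system-wide:

```python
import os
import urllib.request

# Mirror of what /etc/profile.d/proxy.sh exports inside the container:
os.environ["http_proxy"] = "http://127.0.0.1:48080"
os.environ["https_proxy"] = "http://127.0.0.1:48080"

# curl, wget, requests, httpx, and urllib all discover it the same way:
proxies = urllib.request.getproxies()
```

Anything that speaks raw sockets rather than HTTP (nmap, for instance) naturally bypasses the proxy and leaves no trace in the Caido project DB.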

5.3 Ports

Both ports (48080 for Caido, 48081 for the tool server) are bound to 0.0.0.0 inside the container but published only to the host (docker.containers.run(ports=...)).


6. File System

Nothing is automatically persisted back to the host mid-run. The host tracks findings in strix_runs/<run_name>/*.json via the tracer, which records the agent's LLM conversation — the raw tool output is stored in those JSON logs rather than synced from the container FS. If the container is destroyed before the host extracts its files, they're gone.
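If you do want the raw files, you have to pull them out yourself before teardown. One way, sketched here, is unpacking the tar stream that the docker SDK's get_archive() yields (the SDK call itself stays a comment; only the unpacking is shown):

```python
import io
import tarfile

def extract_archive_stream(chunks, dest: str) -> list[str]:
    """Join a get_archive()-style chunk iterator and unpack it under dest."""
    buf = io.BytesIO(b"".join(chunks))
    with tarfile.open(fileobj=buf) as tar:
        tar.extractall(dest)
        return tar.getnames()

# stream, _stat = container.get_archive("/workspace")  # docker SDK (sketch)
# extract_archive_stream(stream, "strix_runs/<run_name>/workspace")
```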


7. Multi-Agent Isolation

All subagents reuse the root agent's sandbox. When create_agent spawns a child, it:

  1. Copies sandbox_id, sandbox_token, sandbox_info from the parent's AgentState into the child's (agents_graph_actions.py:441-461).
  2. Does not call DockerRuntime.create_sandbox again.
  3. Points the child's executor at the same tool_server instance, with the same bearer token; the server distinguishes agents by the agent_id field in the request body.
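The inheritance step can be sketched as a straight field copy; AgentState here is an illustrative subset of the real class, and spawn_child is a hypothetical helper:

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    """Illustrative subset of the real AgentState fields."""
    agent_id: str
    sandbox_id: str = ""
    sandbox_token: str = ""
    sandbox_info: dict = field(default_factory=dict)

def spawn_child(parent: AgentState, child_id: str) -> AgentState:
    """Children inherit the parent's sandbox wholesale; create_sandbox is not called again."""
    return AgentState(
        agent_id=child_id,  # the only divergence; the tool server keys on this
        sandbox_id=parent.sandbox_id,
        sandbox_token=parent.sandbox_token,
        sandbox_info=dict(parent.sandbox_info),
    )
```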

Per-agent state inside the container:


8. Failure Modes

| Failure | Behavior |
| --- | --- |
| Container dies mid-run | Next tool call fails with ConnectError; host re-POSTs on retry; no auto-restart |
| Tool exceeds timeout | asyncio.wait_for cancels the awaiter and raises TimeoutError; server returns "Tool timed out after Ns" as the error field; the tool thread keeps running until container GC |
| Docker daemon unreachable | DockerRuntime raises on create_sandbox(); host exits with a clear message |
| Host process dies | Container keeps running, orphaned. The next scan with the same scan_id reuses it (_get_or_create_container). cleanup() spawns a detached docker rm -f subprocess |
| Bearer-token mismatch | 401 from tool_server; host treats it as RuntimeError; usually indicates a stale sandbox cache |
| Playwright browser crash | Browser instance is auto-relaunched on the next browser_action |
| Caido proxy crash | The entrypoint exits; Docker restarts the container if the restart policy allows, otherwise traffic capture stops silently |

State recovery: _recover_container_state (docker_runtime.py:72-85) extracts TOOL_SERVER_TOKEN + port mappings from the container's env / bindings metadata — so you can reattach to a running container across host process restarts.
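The recovery step reduces to parsing `docker inspect`-style metadata; the attrs shape below follows the Docker API's Config.Env and NetworkSettings.Ports layout, but the function itself is an illustrative reconstruction:

```python
def recover_container_state(attrs: dict) -> dict:
    """Rebuild {token, ports} from a container's inspect data."""
    env = dict(item.split("=", 1) for item in attrs["Config"]["Env"])
    ports = {
        container_port: bindings[0]["HostPort"]
        for container_port, bindings in attrs["NetworkSettings"]["Ports"].items()
        if bindings  # unpublished ports map to None
    }
    return {"token": env.get("TOOL_SERVER_TOKEN"), "ports": ports}
```

Because the token travels in the container's env, anyone who can run `docker inspect` on the host can read it; that is consistent with the threat model, since the host side is trusted.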


9. Design Observations

Good ideas:

Potential pitfalls: