CLI & Textual TUI
strix is a terminal program; the interface layer is everything between
"user hits return" and the agent loop starting — plus, in interactive mode,
a Textual-based live dashboard that renders agent activity as it happens.
1. Entry Point
Declared in pyproject.toml:52:

```toml
[project.scripts]
strix = "strix.interface.main:main"
```

main() (strix/interface/main.py:540-637) runs this sequence:
- Startup: pick the Windows event-loop policy if needed, then call `parse_arguments()`.
- Config override: if `--config` is given, load its JSON over the in-process `Config` class.
- Dependency check: `check_docker_installed()`.
- Warm-up: `pull_docker_image()` pulls the sandbox image on a cache miss; `validate_environment()` checks that `STRIX_LLM` is set (the API key is optional — some providers use IAM); `warm_up_llm()` sends a "Reply with just 'OK'" request with a 300s timeout to fail fast on bad credentials/endpoints.
- Persist config to `~/.strix/cli-config.json` so the next run doesn't re-prompt.
- Target processing: `generate_run_name(targets)` produces `strix_runs/<slug>_<hex>`; repo targets are cloned to `/tmp/strix_repos/<run>/...`; local sources are collected for diff-scope.
- Diff-scope resolution: `resolve_diff_scope_context()` returns an instruction block (merged into user instructions) describing which files changed in the PR.
- Dispatch: non-interactive → `run_cli(args)`, interactive → `run_tui(args)`. Both are wrapped in signal handling and an exception logger that reports to PostHog.
2. CLI Arguments
parse_arguments() (main.py:266-425):
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `--target` / `-t` | str (append) | REQUIRED | Repo URL, domain, URL, IP, or local path. Repeatable for multi-target. |
| `--instruction` | str | None | Inline guidance ("focus on IDOR"). |
| `--instruction-file` | str | None | Read instructions from file. Mutually exclusive with `--instruction`. |
| `--non-interactive` / `-n` | flag | False | Headless CLI mode (no TUI). |
| `--scan-mode` / `-m` | `quick` / `standard` / `deep` | `deep` | Loads the matching skill, affects reasoning effort. |
| `--scope-mode` | `auto` / `diff` / `full` | `auto` | PR-diff analysis trigger. |
| `--diff-base` | str | None | Git ref baseline for diff mode. |
| `--config` | path | None | Override `~/.strix/cli-config.json`. |
| `--version` / `-v` | flag | — | Print version. |
Target Type Inference (interface/utils.py:1085-1150)
Heuristic order:
- `git@`, `git://`, or a `.git` suffix → repository
- URL with scheme/query, or a GitHub/GitLab host → web_application
- IPv4/IPv6 literal → ip_address
- Existing directory path → local_code
- Contains `.` and no path → domain → auto-`https://`
This means the user doesn't have to specify target type — Strix figures it out.
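The chain above can be sketched as follows — hypothetical code, not the actual utils.py implementation; the function name and the exact host list are illustrative:

```python
import ipaddress
from pathlib import Path
from urllib.parse import urlparse


def infer_target_type(target: str) -> str:
    """Illustrative heuristic chain; checks run in priority order."""
    # 1. Git-style references are repositories.
    if target.startswith(("git@", "git://")) or target.endswith(".git"):
        return "repository"
    # 2. Explicit http(s) scheme, or a known code-hosting domain -> web app.
    host = urlparse(target).netloc or target.split("/")[0]
    if urlparse(target).scheme in ("http", "https") or \
            host.endswith(("github.com", "gitlab.com")):
        return "web_application"
    # 3. Bare IPv4/IPv6 literal.
    try:
        ipaddress.ip_address(target)
        return "ip_address"
    except ValueError:
        pass
    # 4. Existing local directory.
    if Path(target).is_dir():
        return "local_code"
    # 5. Dotted name with no path component -> domain (caller prepends https://).
    if "." in target and "/" not in target:
        return "domain"
    return "unknown"
```

The ordering matters: `git@github.com:org/repo.git` would also satisfy the host check, so the repository test has to run first.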
3. Config Bootstrapping
Four-level precedence (implemented across strix/config/config.py and
interface/main.py:52-255), highest first:

1. Environment variables — `STRIX_LLM`, `LLM_API_KEY`, `LLM_API_BASE`, `PERPLEXITY_API_KEY`, `STRIX_REASONING_EFFORT`, etc.
2. `--config` file — if passed.
3. `~/.strix/cli-config.json` — persisted across runs.
4. Defaults from the `Config` class.
Change detection: _llm_env_changed() (config.py:72-83) clears stale
saved config when env LLM creds differ from what's on disk. File is
saved with chmod 0600 on Unix.
Validation flow bails out with a clear panel if:

- `STRIX_LLM` is missing → "set STRIX_LLM=…".
- The warm-up LLM call fails → "check connectivity/creds/model name".
4. Interactive vs Non-Interactive
Different code paths; same agent.
4.1 run_cli — Headless (interface/cli.py:23-206)
- Linear output via Rich panels.
- Live stats line every 2s: vuln count, agent count, tokens, cost.
- The `tracer.vulnerability_found_callback` callback prints each finding as it's reported.
- At scan end: final report panel, `exit(2)` if vulnerabilities were found, `exit(0)` otherwise.
- Signal handlers (SIGINT/SIGTERM/SIGHUP) → flush tracer + container teardown + `sys.exit(1)`; `atexit.register(cleanup_on_exit)` as belt-and-suspenders.
Suited for CI, Docker jobs, SSH-less environments.
4.2 run_tui — Textual (interface/tui.py:685-2096)
Full dashboard. Agent runs in a background thread; UI polls tracer every 350ms.
Layout
```
┌────────────────────────────────────────────────────────────┐
│ Strix v0.8.3                                               │
├────────────────────────────────┬───────────────────────────┤
│                                │ Agents Tree               │
│                                │ ⚪ RootAgent (2)           │
│ Chat / activity stream         │   🟢 SQLi-validate (1)    │
│                                │   ⏸ XSS-recon             │
│ (renders per-tool widgets,     ├───────────────────────────┤
│  streamed LLM text,            │ Vulnerabilities           │
│  streaming tool calls)         │ 🔴 critical SQLi-login    │
│                                │ 🟠 high IDOR-users        │
│                                ├───────────────────────────┤
│                                │ Stats                     │
│                                │ Tokens: 128k Cost: $2.13  │
├────────────────────────────────┴───────────────────────────┤
│ > chat input (Enter = send, Shift+Enter = newline)         │
└────────────────────────────────────────────────────────────┘
```
Widgets
- SplashScreen — animated banner, ~4.5s shine effect (tui.py:96-193).
- Agents Tree — hierarchical tree of agent nodes with status icons (⚪ running, 🟢 done, 🔴 failed, ⏸ waiting) and a vuln count in brackets. Selecting a node switches the chat view.
- Chat Display — renders events per agent_id, merges streamed LLM output, and caches rendered output by `(agent_id, content_length)` to avoid re-rendering on every tick (tui.py:1122-1159).
- Chat Input — custom `ChatTextArea`; Enter = send, Shift+Enter = newline, auto-resizes from 1 to 8 lines.
- Stats Display — token usage, cost, model name, updated every tick.
- Vulnerabilities Panel — clickable list with severity-colored dots. Click → `VulnerabilityDetailScreen` modal with a copy-markdown button.
Modals
`HelpScreen` (F1), `QuitScreen` (Ctrl+Q), `StopAgentScreen` (Esc), `VulnerabilityDetailScreen` (click a finding).
5. Streaming Parser (interface/streaming_parser.py:43-126)
Parses incomplete LLM output in real time so the TUI can render partial tool calls as they arrive.
Input (an incoming chunk while the LLM is still generating):

```
I'll probe the login form next.
<function=browser_action>
<parameter=action>click
```

Output — a list of StreamSegment:

```python
[
    StreamSegment(type="text", content="I'll probe the login form next."),
    StreamSegment(type="tool", tool_name="browser_action",
                  args={"action": "click"}, is_complete=False),
]
```

The parser walks the string, alternating between text and `<function=X>`
blocks. For each function block:
- Captures the body until `</function>` (complete) or the next `<function>` (abandoned / incomplete).
- Applies `normalize_tool_format()` to shim legacy `<invoke>` syntax.
- Parses parameters via a small XML-ish recognizer (`_parse_streaming_params`) with HTML-entity unescaping.
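A minimal reimplementation of that loop — the real parser in streaming_parser.py handles more edge cases (the legacy-syntax shim, entity unescaping), but the text/function alternation looks roughly like this:

```python
import re
from dataclasses import dataclass, field


@dataclass
class StreamSegment:
    type: str                       # "text" or "tool"
    content: str = ""
    tool_name: str = ""
    args: dict = field(default_factory=dict)
    is_complete: bool = False


_PARAM_RE = re.compile(r"<parameter=(\w+)>([^<]*)")


def parse_stream(chunk: str) -> list[StreamSegment]:
    """Split a partial LLM chunk into text and (possibly incomplete) tool segments."""
    segments: list[StreamSegment] = []
    pos = 0
    for m in re.finditer(r"<function=(\w+)>", chunk):
        if m.start() < pos:         # tag already consumed inside a prior body
            continue
        text = chunk[pos:m.start()].strip()
        if text:
            segments.append(StreamSegment(type="text", content=text))
        close = chunk.find("</function>", m.end())
        nxt = chunk.find("<function=", m.end())
        # Complete only if a close tag appears before the next open tag.
        complete = close != -1 and (nxt == -1 or close < nxt)
        end = close if complete else (nxt if nxt != -1 else len(chunk))
        args = {k: v.strip() for k, v in _PARAM_RE.findall(chunk[m.end():end])}
        segments.append(StreamSegment(type="tool", tool_name=m.group(1),
                                      args=args, is_complete=complete))
        pos = end + len("</function>") if complete else end
    tail = chunk[pos:].strip()
    if tail:
        segments.append(StreamSegment(type="text", content=tail))
    return segments
```

Feeding it the example chunk above yields a text segment plus an incomplete `browser_action` tool segment with `args={"action": "click"}`.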
Caching (tui.py:1122-1159): the TUI keys rendered output by
(agent_id, len(content)); when the length grows, it re-parses and
re-renders only that region.
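The cache idea in miniature — a sketch of the pattern, not the actual tui.py code:

```python
class RenderCache:
    """Reuse rendered output while an agent's content length is unchanged."""

    def __init__(self, render_fn):
        self._render_fn = render_fn
        self._cache: dict[str, tuple[int, object]] = {}  # agent_id -> (length, rendered)

    def get(self, agent_id: str, content: str):
        hit = self._cache.get(agent_id)
        if hit is not None and hit[0] == len(content):
            return hit[1]           # nothing new streamed in: skip the re-render
        rendered = self._render_fn(content)
        self._cache[agent_id] = (len(content), rendered)
        return rendered
```

Using length as the fingerprint is sound here because streamed LLM output is append-only; it would miss an in-place edit of existing content.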
6. Tool Renderers (interface/tool_components/)
Per-tool visual rendering. Each subclass of BaseToolRenderer registers
with @register_tool_renderer and returns a Textual Static widget.
| Tool pattern | Renderer | What it shows |
|---|---|---|
| `browser_*` | BrowserRenderer | Tab list, screenshots (embedded), last action |
| `terminal_*`, `run_command` | TerminalRenderer | Pyte-parsed terminal replay with ANSI colors |
| `proxy_*` | ProxyRenderer | HTTP request/response diffs, headers, body preview |
| `python_*` | PythonRenderer | Code + exec result, with syntax highlighting |
| `file_edit_*` | FileEditRenderer | Before/after diff |
| `load_skill_*` | LoadSkillRenderer | "Loaded: sql_injection, idor" |
| `scan_start_info`, `subagent_start_info` | ScanInfoRenderer | Scan metadata panels |
| `create_vulnerability_report` | ReportingRenderer | Severity-colored finding card |
| `finish_*` | FinishRenderer | Completion status |
| `web_search_*` | WebSearchRenderer | Search results |
| user/agent messages | UserMessageRenderer, AgentMessageRenderer | Plain + markdown |
This is where the "watching an agent hack" experience comes from — the TUI doesn't just show JSON, it shows terminal output replays, screenshot previews, and diff views inline.
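The registration pattern can be sketched like this — the decorator and lookup are illustrative, and the real renderers return Textual widgets rather than the strings used here:

```python
TOOL_RENDERERS: dict[str, type] = {}


def register_tool_renderer(cls):
    """Map each of the class's tool-name patterns to the renderer class."""
    for pattern in cls.patterns:
        TOOL_RENDERERS[pattern] = cls
    return cls


class BaseToolRenderer:
    patterns: tuple[str, ...] = ()

    def render(self, call: dict) -> str:
        raise NotImplementedError


@register_tool_renderer
class TerminalRenderer(BaseToolRenderer):
    patterns = ("terminal_", "run_command")

    def render(self, call: dict) -> str:
        return f"$ {call.get('command', '')}"


def renderer_for(tool_name: str) -> BaseToolRenderer:
    """Prefix-match the tool name against registered patterns."""
    for pattern, cls in TOOL_RENDERERS.items():
        if tool_name.startswith(pattern):
            return cls()
    return BaseToolRenderer()
```

The payoff is locality: adding a tool means adding a schema, an implementation, and one renderer class, with no central dispatch table to edit.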
7. Scan-Mode & Scope-Mode Resolution
7.1 Scan Mode
--scan-mode {quick, standard, deep} maps directly into
LLMConfig.scan_mode, which the LLM layer uses to:
- Pick the corresponding skill (`scan_modes/<mode>`) to auto-load (llm/llm.py:111-125).
- Set the default reasoning effort: `quick` → medium, else high (llm/llm.py:74-82).
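In miniature (a sketch of the mapping, not the llm.py code):

```python
# One auto-loaded skill per scan mode.
SCAN_MODE_SKILLS = {mode: f"scan_modes/{mode}"
                    for mode in ("quick", "standard", "deep")}


def default_reasoning_effort(scan_mode: str) -> str:
    # quick trades reasoning depth for speed; standard and deep think harder.
    return "medium" if scan_mode == "quick" else "high"
```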
7.2 Scope Mode (interface/utils.py:988-1070)
resolve_diff_scope_context() decides whether to focus on a PR diff.
- `full` — full codebase, no diff scope.
- `auto` — enabled in PR/headless CI environments (GitHub Actions env vars detected).
- `diff` — forced, even outside CI.
Diff-base resolution order (utils.py:657-684):
1. Explicit `--diff-base`.
2. `GITHUB_BASE_REF` → `refs/remotes/origin/<base_ref>`.
3. GitHub Actions event payload base SHA.
4. `origin/HEAD` symbolic ref.
5. Fallback to `origin/main` or `origin/master`.
6. If still nothing → error with a hint ("use `actions/checkout` with `fetch-depth: 0`").
Output is a DiffScopeResult with an instruction_block that gets
merged into the user's instructions:
```
The user is requesting a review of a Pull Request.
Instruction: Direct your analysis primarily at the changes in the listed files…
Repository Scope: my-repo
Base reference: origin/main
Merge base: abc123…
Primary Focus (changed files to analyze):
- src/auth.py
- tests/auth_test.py
```
This text becomes part of the agent's task description; the agent treats the listed files as the "primary focus" for triage.
8. Output Artifacts
Every run writes to strix_runs/<slug>_<hex>/. The Tracer creates
the dir and appends:
| File | Contents |
|---|---|
| `events.jsonl` | Canonical event stream — run.started, agent.created, chat.message, tool.execution.*, finding.created, etc., with trace_id/span_id correlation |
| `messages.json` | Full conversation history per agent |
| `tool_executions.json` | Structured log of every tool call + result |
| `vulnerability_reports.json` | Findings with CVSS, PoC, remediation |
| `scan_metadata.json` | Targets, instructions, diff-scope decision |
Screenshots from the browser renderer live inside the sandbox's
/home/pentester/output (or wherever the tool wrote them) and are
base64-embedded into the JSON artifacts on the host, since nothing is
synced back from the container filesystem directly.
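Since events.jsonl is plain JSON Lines, post-processing is trivial. A sketch, assuming each event object carries a `type` field named as in the table above:

```python
import json
from collections import Counter
from pathlib import Path


def summarize_events(run_dir: str) -> Counter:
    """Count event types in a run's events.jsonl (one JSON object per line)."""
    counts: Counter = Counter()
    for line in (Path(run_dir) / "events.jsonl").read_text().splitlines():
        if line.strip():
            counts[json.loads(line)["type"]] += 1
    return counts
```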
9. Signal Handling
CLI mode (interface/cli.py:111-125)
```python
signal.signal(signal.SIGINT, signal_handler)   # Ctrl+C
signal.signal(signal.SIGTERM, signal_handler)  # kill
signal.signal(signal.SIGHUP, signal_handler)   # terminal close
atexit.register(cleanup_on_exit)               # backstop
```

signal_handler → flush tracer → teardown runtime → sys.exit(1).
TUI mode (interface/tui.py:766-781, 1960-1968)
Ctrl+Q or Ctrl+C → QuitScreen modal → on confirm:
```python
def action_custom_quit(self):
    if self._scan_thread.is_alive():
        self._scan_stop_event.set()
        self._scan_thread.join(timeout=1.0)
    tracer.cleanup()
    self.exit()
```

The agent thread polls _scan_stop_event cooperatively. After the 1s
timeout the TUI exits anyway — a deliberately short fuse so the TUI
never hangs.
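The cooperative-stop contract in miniature (a sketch of the pattern, not the strix code):

```python
import threading
import time


def agent_loop(stop_event: threading.Event):
    """Illustrative worker: check the stop flag between iterations."""
    while not stop_event.is_set():
        # ... run one agent step ...
        time.sleep(0.01)


stop = threading.Event()
worker = threading.Thread(target=agent_loop, args=(stop,), daemon=True)
worker.start()

stop.set()                  # what the QuitScreen confirm handler does
worker.join(timeout=1.0)    # short fuse: give up after 1s so the UI never hangs
```

Because the loop only checks the flag between steps, a long-running step delays shutdown — which is exactly why the join uses a timeout instead of waiting forever.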
Exit Codes
- `0` — success (no vulns in non-interactive mode).
- `1` — interrupted / error / cleanup failure.
- `2` — vulnerabilities found (non-interactive only, main.py:630-633).
The 2 exit code is what lets CI fail the PR build on findings.
10. Design Observations
Good ideas:
- Dual-mode entry (CLI for CI, TUI for humans) sharing the agent core. Clean separation.
- Live streaming-parser rendering — partial tool calls appear as their XML arguments stream in, not just finished ones. Gives the TUI a real "watching it hack" feel.
- Per-tool renderers — the tool system's XML schema pairs naturally with a component-per-tool UI. Extending a tool = adding a schema + impl + renderer; everything is colocated conceptually.
- Agent tree with live status icons — surfaces the multi-agent graph in a way the user can reason about.
- Exit-code 2 for findings — simple, conventional CI failure signal.
Pitfalls:
- The TUI runs the agent in a thread, but the agent is async — which means a dedicated event loop runs per thread. Thread-safety of the shared module-level agent graph (the `_agent_graph` dicts) is reasonable in practice but isn't explicitly synchronized.
- UI polling every 350ms is fine but means ~3 Hz responsiveness for stats. Under heavy agent activity, the tracer event log can be large and re-rendering non-trivial.
- Output artifacts include full LLM conversation history in JSON. For very long scans, these files can be huge (tens of MB). No rotation or compression.
- Signal handling differs between CLI and TUI enough that the TUI's 1-second thread join can leak subprocesses (a running sqlmap inside the container continues until container teardown). This is fine in practice but worth noting.