CLI & Textual TUI
strix is a terminal program; the interface layer is everything between
"user hits return" and the agent loop starting — plus, in interactive mode,
a Textual-based live dashboard that renders agent activity as it happens.
1. Entry Point
Declared in pyproject.toml:52:

```toml
[project.scripts]
strix = "strix.interface.main:main"
```

main() (strix/interface/main.py:540-637) runs this sequence:
- Startup: pick the Windows event-loop policy if needed, then call `parse_arguments()`.
- Config override: if `--config` is given, load its JSON over the in-process `Config` class.
- Dependency check: `check_docker_installed()`.
- Warm-up: `pull_docker_image()` pulls the sandbox image on a cache miss; `validate_environment()` checks that `STRIX_LLM` is set (the API key is optional — some providers use IAM); `warm_up_llm()` sends a "Reply with just 'OK'" request with a 300s timeout to fail fast on bad credentials/endpoints.
- Persist config to `~/.strix/cli-config.json` so the next run doesn't re-prompt.
- Target processing: `generate_run_name(targets)` produces `strix_runs/<slug>_<hex>`; repo targets are cloned to `/tmp/strix_repos/<run>/...`; local sources are collected for diff-scope.
- Diff-scope resolution: `resolve_diff_scope_context()` returns an instruction block (merged into user instructions) describing which files changed in the PR.
- Dispatch: non-interactive → `run_cli(args)`, interactive → `run_tui(args)`. Both are wrapped in signal handling and an exception logger that reports to PostHog.
2. CLI Arguments
parse_arguments() (main.py:266-425):
| Flag | Type | Default | Purpose |
|---|---|---|---|
| `--target` / `-t` | str (append) | REQUIRED | Repo URL, domain, URL, IP, or local path. Repeatable for multi-target. |
| `--instruction` | str | None | Inline guidance ("focus on IDOR"). |
| `--instruction-file` | str | None | Read instructions from file. Mutually exclusive with `--instruction`. |
| `--non-interactive` / `-n` | flag | False | Headless CLI mode (no TUI). |
| `--scan-mode` / `-m` | `quick` / `standard` / `deep` | `deep` | Loads the matching skill, affects reasoning effort. |
| `--scope-mode` | `auto` / `diff` / `full` | `auto` | PR-diff analysis trigger. |
| `--diff-base` | str | None | Git ref baseline for diff mode. |
| `--config` | path | None | Override `~/.strix/cli-config.json`. |
| `--version` / `-v` | flag | — | Print version. |
Target Type Inference (interface/utils.py:1085-1150)
Heuristic order:
- `git@`, `git://`, or a `.git` suffix → repository
- URL with scheme/query, or a GitHub/GitLab host → web_application
- IPv4/IPv6 literal → ip_address
- Existing directory path → local_code
- Contains `.` and no path → domain → auto-`https://`
This means the user doesn't have to specify target type — Strix figures it out.
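The chain above can be sketched as follows — hypothetical code, not the actual utils.py implementation; the function name and the exact host list are illustrative:

```python
import ipaddress
from pathlib import Path
from urllib.parse import urlparse


def infer_target_type(target: str) -> str:
    """Illustrative heuristic chain; checks run in priority order."""
    # 1. Git-style references are repositories.
    if target.startswith(("git@", "git://")) or target.endswith(".git"):
        return "repository"
    # 2. Explicit http(s) scheme, or a known code-hosting domain -> web app.
    host = urlparse(target).netloc or target.split("/")[0]
    if urlparse(target).scheme in ("http", "https") or \
            host.endswith(("github.com", "gitlab.com")):
        return "web_application"
    # 3. Bare IPv4/IPv6 literal.
    try:
        ipaddress.ip_address(target)
        return "ip_address"
    except ValueError:
        pass
    # 4. Existing local directory.
    if Path(target).is_dir():
        return "local_code"
    # 5. Dotted name with no path component -> domain (caller prepends https://).
    if "." in target and "/" not in target:
        return "domain"
    return "unknown"
```

The ordering matters: `git@github.com:org/repo.git` would also satisfy the host check, so the repository test has to run first.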
3. Config Bootstrapping
Four-level precedence (implemented across strix/config/config.py and
interface/main.py:52-255), highest first:

1. Environment variables — `STRIX_LLM`, `LLM_API_KEY`, `LLM_API_BASE`, `PERPLEXITY_API_KEY`, `STRIX_REASONING_EFFORT`, etc.
2. `--config` file — if passed.
3. `~/.strix/cli-config.json` — persisted across runs.
4. Defaults from the `Config` class.
Change detection: _llm_env_changed() (config.py:72-83) clears stale
saved config when env LLM creds differ from what's on disk. File is
saved with chmod 0600 on Unix.
Validation flow bails out with a clear panel if:

- `STRIX_LLM` is missing → "set STRIX_LLM=…".
- The warm-up LLM call fails → "check connectivity/creds/model name".
4. Interactive vs Non-Interactive
Different code paths; same agent.
4.1 run_cli — Headless (interface/cli.py:23-206)
- Linear output via Rich panels.
- Live stats line every 2s: vuln count, agent count, tokens, cost.
- The `tracer.vulnerability_found_callback` callback prints each finding as it's reported.
- At scan end: final report panel, `exit(2)` if vulnerabilities were found, `exit(0)` otherwise.
- Signal handlers (SIGINT/SIGTERM/SIGHUP) → flush tracer + container teardown + `sys.exit(1)`; `atexit.register(cleanup_on_exit)` as belt-and-suspenders.
Suited for CI, Docker jobs, SSH-less environments.
4.2 run_tui — Textual (interface/tui.py:685-2096)
Full dashboard. Agent runs in a background thread; UI polls tracer every 350ms.
Layout
```
┌────────────────────────────────────────────────────────────┐
│ Strix v0.8.3                                               │
├────────────────────────────────┬───────────────────────────┤
│                                │ Agents Tree               │
│                                │ ⚪ RootAgent (2)           │
│ Chat / activity stream         │   🟢 SQLi-validate (1)    │
│                                │   ⏸ XSS-recon             │
│ (renders per-tool widgets,     ├───────────────────────────┤
│  streamed LLM text,            │ Vulnerabilities           │
│  streaming tool calls)         │ 🔴 critical SQLi-login    │
│                                │ 🟠 high IDOR-users        │
│                                ├───────────────────────────┤
│                                │ Stats                     │
│                                │ Tokens: 128k Cost: $2.13  │
├────────────────────────────────┴───────────────────────────┤
│ > chat input (Enter = send, Shift+Enter = newline)         │
└────────────────────────────────────────────────────────────┘
```
Widgets
- SplashScreen — animated banner, ~4.5s shine effect (tui.py:96-193).
- Agents Tree — hierarchical tree of agent nodes with status icons (⚪ running, 🟢 done, 🔴 failed, ⏸ waiting) and a vuln count in brackets. Selecting a node switches the chat view.
- Chat Display — renders events per agent_id, merges streamed LLM output, and caches rendered output by `(agent_id, content_length)` to avoid re-rendering on every tick (tui.py:1122-1159).
- Chat Input — custom `ChatTextArea`; Enter = send, Shift+Enter = newline, auto-resizes from 1 to 8 lines.
- Stats Display — token usage, cost, model name, updated every tick.
- Vulnerabilities Panel — clickable list with severity-colored dots. Click → `VulnerabilityDetailScreen` modal with a copy-markdown button.
Modals
`HelpScreen` (F1), `QuitScreen` (Ctrl+Q), `StopAgentScreen` (Esc), `VulnerabilityDetailScreen` (click a finding).
5. Streaming Parser (interface/streaming_parser.py:43-126)
Parses incomplete LLM output in real time so the TUI can render partial tool calls as they arrive.
Input (an incoming chunk while the LLM is still generating):

```
I'll probe the login form next.
<function=browser_action>
<parameter=action>click
```

Output — a list of StreamSegment:

```python
[
    StreamSegment(type="text", content="I'll probe the login form next."),
    StreamSegment(type="tool", tool_name="browser_action",
                  args={"action": "click"}, is_complete=False),
]
```

The parser walks the string, alternating between text and `<function=X>`
blocks. For each function block:
- Captures the body until `</function>` (complete) or the next `<function>` (abandoned / incomplete).
- Applies `normalize_tool_format()` to shim legacy `<invoke>` syntax.
- Parses parameters via a small XML-ish recognizer (`_parse_streaming_params`) with HTML-entity unescaping.
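A minimal reimplementation of that loop — the real parser in streaming_parser.py handles more edge cases (the legacy-syntax shim, entity unescaping), but the text/function alternation looks roughly like this:

```python
import re
from dataclasses import dataclass, field


@dataclass
class StreamSegment:
    type: str                       # "text" or "tool"
    content: str = ""
    tool_name: str = ""
    args: dict = field(default_factory=dict)
    is_complete: bool = False


_PARAM_RE = re.compile(r"<parameter=(\w+)>([^<]*)")


def parse_stream(chunk: str) -> list[StreamSegment]:
    """Split a partial LLM chunk into text and (possibly incomplete) tool segments."""
    segments: list[StreamSegment] = []
    pos = 0
    for m in re.finditer(r"<function=(\w+)>", chunk):
        if m.start() < pos:         # tag already consumed inside a prior body
            continue
        text = chunk[pos:m.start()].strip()
        if text:
            segments.append(StreamSegment(type="text", content=text))
        close = chunk.find("</function>", m.end())
        nxt = chunk.find("<function=", m.end())
        # Complete only if a close tag appears before the next open tag.
        complete = close != -1 and (nxt == -1 or close < nxt)
        end = close if complete else (nxt if nxt != -1 else len(chunk))
        args = {k: v.strip() for k, v in _PARAM_RE.findall(chunk[m.end():end])}
        segments.append(StreamSegment(type="tool", tool_name=m.group(1),
                                      args=args, is_complete=complete))
        pos = end + len("</function>") if complete else end
    tail = chunk[pos:].strip()
    if tail:
        segments.append(StreamSegment(type="text", content=tail))
    return segments
```

Feeding it the example chunk above yields a text segment plus an incomplete `browser_action` tool segment with `args={"action": "click"}`.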
Caching (tui.py:1122-1159): the TUI keys rendered output by
(agent_id, len(content)); when the length grows, it re-parses and
re-renders only that region.
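The cache idea in miniature — a sketch of the pattern, not the actual tui.py code:

```python
class RenderCache:
    """Reuse rendered output while an agent's content length is unchanged."""

    def __init__(self, render_fn):
        self._render_fn = render_fn
        self._cache: dict[str, tuple[int, object]] = {}  # agent_id -> (length, rendered)

    def get(self, agent_id: str, content: str):
        hit = self._cache.get(agent_id)
        if hit is not None and hit[0] == len(content):
            return hit[1]           # nothing new streamed in: skip the re-render
        rendered = self._render_fn(content)
        self._cache[agent_id] = (len(content), rendered)
        return rendered
```

Using length as the fingerprint is sound here because streamed LLM output is append-only; it would miss an in-place edit of existing content.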
6. Tool Renderers (interface/tool_components/)
Per-tool visual rendering. Each subclass of BaseToolRenderer registers
with @register_tool_renderer and returns a Textual Static widget.
| Tool pattern | Renderer | What it shows |
|---|---|---|
| `browser_*` | BrowserRenderer | Tab list, screenshots (embedded), last action |
| `terminal_*`, `run_command` | TerminalRenderer | Pyte-parsed terminal replay with ANSI colors |
| `proxy_*` | ProxyRenderer | HTTP request/response diffs, headers, body preview |
| `python_*` | PythonRenderer | Code + exec result, with syntax highlighting |
| `file_edit_*` | FileEditRenderer | Before/after diff |
| `load_skill_*` | LoadSkillRenderer | "Loaded: sql_injection, idor" |
| `scan_start_info`, `subagent_start_info` | ScanInfoRenderer | Scan metadata panels |
| `create_vulnerability_report` | ReportingRenderer | Severity-colored finding card |
| `finish_*` | FinishRenderer | Completion status |
| `web_search_*` | WebSearchRenderer | Search results |
| user/agent messages | UserMessageRenderer, AgentMessageRenderer | Plain + markdown |
This is where the "watching an agent hack" experience comes from — the TUI doesn't just show JSON, it shows terminal output replays, screenshot previews, and diff views inline.
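The registration pattern can be sketched like this — the decorator and lookup are illustrative, and the real renderers return Textual widgets rather than the strings used here:

```python
TOOL_RENDERERS: dict[str, type] = {}


def register_tool_renderer(cls):
    """Map each of the class's tool-name patterns to the renderer class."""
    for pattern in cls.patterns:
        TOOL_RENDERERS[pattern] = cls
    return cls


class BaseToolRenderer:
    patterns: tuple[str, ...] = ()

    def render(self, call: dict) -> str:
        raise NotImplementedError


@register_tool_renderer
class TerminalRenderer(BaseToolRenderer):
    patterns = ("terminal_", "run_command")

    def render(self, call: dict) -> str:
        return f"$ {call.get('command', '')}"


def renderer_for(tool_name: str) -> BaseToolRenderer:
    """Prefix-match the tool name against registered patterns."""
    for pattern, cls in TOOL_RENDERERS.items():
        if tool_name.startswith(pattern):
            return cls()
    return BaseToolRenderer()
```

The payoff is locality: adding a tool means adding a schema, an implementation, and one renderer class, with no central dispatch table to edit.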
7. Scan-Mode & Scope-Mode Resolution
7.1 Scan Mode
--scan-mode {quick, standard, deep} maps directly into
LLMConfig.scan_mode, which the LLM layer uses to:
- Pick the corresponding skill (`scan_modes/<mode>`) to auto-load (llm/llm.py:111-125).
- Set the default reasoning effort: `quick` → medium, else high (llm/llm.py:74-82).
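In miniature (a sketch of the mapping, not the llm.py code):

```python
# One auto-loaded skill per scan mode.
SCAN_MODE_SKILLS = {mode: f"scan_modes/{mode}"
                    for mode in ("quick", "standard", "deep")}


def default_reasoning_effort(scan_mode: str) -> str:
    # quick trades reasoning depth for speed; standard and deep think harder.
    return "medium" if scan_mode == "quick" else "high"
```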
7.2 Scope Mode (interface/utils.py:988-1070)
resolve_diff_scope_context() decides whether to focus on a PR diff.
- `full` — full codebase, no diff scope.
- `auto` — enabled in PR/headless CI environments (GitHub Actions env vars detected).
- `diff` — forced, even outside CI.
Diff-base resolution order (utils.py:657-684):
1. Explicit `--diff-base`.
2. `GITHUB_BASE_REF` → `refs/remotes/origin/<base_ref>`.
3. GitHub Actions event payload base SHA.
4. `origin/HEAD` symbolic ref.
5. Fallback to `origin/main` or `origin/master`.
6. If still nothing → error with a hint ("use `actions/checkout` with `fetch-depth: 0`").
Output is a DiffScopeResult with an instruction_block that gets
merged into the user's instructions:
```
The user is requesting a review of a Pull Request.
Instruction: Direct your analysis primarily at the changes in the listed files…
Repository Scope: my-repo
Base reference: origin/main
Merge base: abc123…
Primary Focus (changed files to analyze):
- src/auth.py
- tests/auth_test.py
```
This text becomes part of the agent's task description; the agent treats the listed files as the "primary focus" for triage.
8. Output Artifacts
Every run writes to strix_runs/<slug>_<hex>/. The Tracer creates
the dir and appends:
| File | Contents |
|---|---|
| `events.jsonl` | Canonical event stream — run.started, agent.created, chat.message, tool.execution.*, finding.created, etc., with trace_id/span_id correlation |
| `messages.json` | Full conversation history per agent |
| `tool_executions.json` | Structured log of every tool call + result |
| `vulnerability_reports.json` | Findings with CVSS, PoC, remediation |
| `scan_metadata.json` | Targets, instructions, diff-scope decision |
Screenshots from the browser renderer live inside the sandbox's
/home/pentester/output (or wherever the tool wrote them) and are
base64-embedded into the JSON artifacts on the host, since nothing is
synced back from the container filesystem directly.
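Since events.jsonl is plain JSON Lines, post-processing is trivial. A sketch, assuming each event object carries a `type` field named as in the table above:

```python
import json
from collections import Counter
from pathlib import Path


def summarize_events(run_dir: str) -> Counter:
    """Count event types in a run's events.jsonl (one JSON object per line)."""
    counts: Counter = Counter()
    for line in (Path(run_dir) / "events.jsonl").read_text().splitlines():
        if line.strip():
            counts[json.loads(line)["type"]] += 1
    return counts
```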
9. Signal Handling
CLI mode (interface/cli.py:111-125)
```python
signal.signal(signal.SIGINT, signal_handler)   # Ctrl+C
signal.signal(signal.SIGTERM, signal_handler)  # kill
signal.signal(signal.SIGHUP, signal_handler)   # terminal close
atexit.register(cleanup_on_exit)               # backstop
```

signal_handler → flush tracer → teardown runtime → sys.exit(1).
TUI mode (interface/tui.py:766-781, 1960-1968)
Ctrl+Q or Ctrl+C → QuitScreen modal → on confirm:
```python
def action_custom_quit(self):
    if self._scan_thread.is_alive():
        self._scan_stop_event.set()
        self._scan_thread.join(timeout=1.0)
    tracer.cleanup()
    self.exit()
```

The agent thread polls _scan_stop_event cooperatively. After the 1s
timeout the TUI exits anyway — a deliberately short fuse so the TUI
never hangs.
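The cooperative-stop contract in miniature (a sketch of the pattern, not the strix code):

```python
import threading
import time


def agent_loop(stop_event: threading.Event):
    """Illustrative worker: check the stop flag between iterations."""
    while not stop_event.is_set():
        # ... run one agent step ...
        time.sleep(0.01)


stop = threading.Event()
worker = threading.Thread(target=agent_loop, args=(stop,), daemon=True)
worker.start()

stop.set()                  # what the QuitScreen confirm handler does
worker.join(timeout=1.0)    # short fuse: give up after 1s so the UI never hangs
```

Because the loop only checks the flag between steps, a long-running step delays shutdown — which is exactly why the join uses a timeout instead of waiting forever.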
Exit Codes
- `0` — success (no vulns in non-interactive mode).
- `1` — interrupted / error / cleanup failure.
- `2` — vulnerabilities found (non-interactive only, main.py:630-633).
The 2 exit code is what lets CI fail the PR build on findings.
10. Design Observations
Good ideas:
- Dual-mode entry (CLI for CI, TUI for humans) sharing the agent core. Clean separation.
- Live streaming-parser rendering — partial tool calls appear as their XML arguments stream in, not just finished ones. Gives the TUI a real "watching it hack" feel.
- Per-tool renderers — the tool system's XML schema pairs naturally with a component-per-tool UI. Extending a tool = adding a schema + impl + renderer; everything is colocated conceptually.
- Agent tree with live status icons — surfaces the multi-agent graph in a way the user can reason about.
- Exit-code 2 for findings — simple, conventional CI failure signal.
Pitfalls:
- The TUI runs the agent in a thread, but the agent is async — which means a dedicated event loop runs per thread. Thread-safety of the shared module-level agent graph (the `_agent_graph` dicts) is reasonable in practice but isn't explicitly synchronized.
- UI polling every 350ms is fine but means ~3 Hz responsiveness for stats. Under heavy agent activity, the tracer event log can be large and re-rendering non-trivial.
- Output artifacts include full LLM conversation history in JSON. For very long scans, these files can be huge (tens of MB). No rotation or compression.
- Signal handling differs between CLI and TUI enough that the TUI's 1-second thread join can leak subprocesses (a running sqlmap inside the container continues until container teardown). This is fine in practice but worth noting.