CodeDocs Vault

Config, Telemetry, Packaging

The infrastructure around the agent loop: how settings are resolved, what data leaves the machine, how the CLI is built and shipped, and what's tested.


1. Config (strix/config/config.py)

Centralized Config class with class-level attributes. Three-level precedence:

  1. Env vars via os.environ (highest).
  2. ~/.strix/cli-config.json — persisted from prior sessions.
  3. Class defaults.

1.1 Tracked Settings

The _tracked_names() helper (config.py:59-69) auto-discovers every lowercase string-or-None attribute — so adding a setting is as easy as adding a class attribute:

strix_llm: str | None = None
strix_reasoning_effort: str = "high"
strix_llm_max_retries: str = "5"
strix_memory_compressor_timeout: str = "30"
llm_timeout: str = "300"
strix_telemetry: str | None = None
strix_otel_telemetry: str | None = None
strix_posthog_telemetry: str | None = None
strix_sandbox_execution_timeout: str = "120"
strix_sandbox_connect_timeout: str = "10"
strix_disable_browser: str | None = None
strix_image: str | None = None
strix_runtime_backend: str | None = None
perplexity_api_key: str | None = None
traceloop_base_url / traceloop_api_key / traceloop_headers

1.2 resolve_llm_config (config.py:190-216)

Special-case: model names starting with strix/ (e.g., strix/claude) auto-pick https://models.strix.ai/api/v1 as the base URL. Other providers fall back to provider-specific base-URL settings in order: llm_api_base, openai_api_base, litellm_base_url, ollama_api_base.

1.3 Load / Save / Apply

This gives the "remembers your API key across runs" UX without clobbering explicit overrides.
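The persistence half can be sketched as follows (the file path matches section 1; the function names are illustrative):

```python
import json
import tempfile
from pathlib import Path

DEFAULT_PATH = Path.home() / ".strix" / "cli-config.json"

def load_settings(path: Path = DEFAULT_PATH) -> dict[str, str]:
    # A missing file just means a fresh install: start empty.
    if not path.exists():
        return {}
    return json.loads(path.read_text())

def save_settings(settings: dict[str, str], path: Path = DEFAULT_PATH) -> None:
    # Persist this session's settings for the next run.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2))

# Round-trip demo against a throwaway directory.
demo_path = Path(tempfile.mkdtemp()) / "cli-config.json"
save_settings({"strix_llm": "strix/claude"}, demo_path)
```

Because env vars are consulted first at resolve time, the saved file restores state without ever overriding an explicit export.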


2. Telemetry

Two channels, both opt-out:

2.1 PostHog — Event Analytics (strix/telemetry/posthog.py)

HTTP POST to https://us.i.posthog.com with the public key phc_7rO3XRuNT5sgSKAl6HDIrWdSGh1COzxw0vxVIAR6vVZ. Events:

  scan_started: model, scan_mode, scan_type (whitebox/blackbox), interactive flag, instruction presence, first_run marker (:76-94)
  finding_reported: severity only (:97-104)
  scan_ended: duration, vulnerability counts by severity, agent count, tool-execution count, token usage, cost (:107-130)
  error: error_type, optional error_msg (:133-137)

Base properties (always attached, :67-73): OS, architecture, Python version, Strix version.

Session ID is random per run — not a persistent user identifier (:18).
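Assembling such an event for PostHog's public /capture/ endpoint looks roughly like this (the request-body fields follow PostHog's documented capture API; the helper names are illustrative, and nothing is sent here):

```python
import platform
import uuid

POSTHOG_HOST = "https://us.i.posthog.com"
PUBLIC_KEY = "phc_7rO3XRuNT5sgSKAl6HDIrWdSGh1COzxw0vxVIAR6vVZ"

SESSION_ID = str(uuid.uuid4())  # random per run, never persisted

def base_properties() -> dict:
    # Attached to every event, mirroring the base properties above.
    return {
        "os": platform.system(),
        "arch": platform.machine(),
        "python_version": platform.python_version(),
    }

def build_event(event: str, properties: dict) -> dict:
    # Shape of a PostHog /capture/ request body.
    return {
        "api_key": PUBLIC_KEY,
        "event": event,
        "distinct_id": SESSION_ID,
        "properties": {**base_properties(), **properties},
    }
```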

2.2 OpenTelemetry / Traceloop (strix/telemetry/tracer.py)

The Tracer class is initialized per run:

Event types emitted (tracer.py:187-268):

run.started, run.configured, agent.created, agent.status.updated, chat.message, tool.execution.started, tool.execution.updated, finding.created, finding.reviewed, run.completed.

Every event carries: trace_id, span_id, parent_span_id, actor (agent_id/name, tool_name), payload, status, error.

2.3 Sanitization (strix/telemetry/utils.py)

TelemetrySanitizer uses scrubadub for PII + custom regex for secrets/tokens.
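The secret-scrubbing half can be sketched with plain regexes (the patterns here are illustrative, not the ones shipped in utils.py):

```python
import re

# Illustrative patterns; the real sanitizer layers scrubadub's PII
# detectors on top of project-specific secret regexes.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),            # OpenAI-style keys
    re.compile(r"(?i)bearer\s+[A-Za-z0-9._\-]+"),  # Authorization headers
    re.compile(r"gh[pousr]_[A-Za-z0-9]{36,}"),     # GitHub tokens
]

def scrub_secrets(text: str) -> str:
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```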

2.4 Feature Flags (strix/telemetry/flags.py:1-24)

Three knobs (0/false/no/off disables): strix_telemetry (master switch), strix_otel_telemetry, and strix_posthog_telemetry.

The per-channel flags fall back to the master if unset.

2.5 OTEL Attribute Pruning (telemetry/utils.py:183-203)

To keep the local JSONL compact, large LLM payloads (llm.input, llm.output, gen_ai.prompt.*, gen_ai.completion.*, llm.input_messages.*, llm.output_messages.*) are dropped from spans before export. The count of filtered attributes is stored in strix.filtered_attributes_count so you can see that something was dropped, just not what.

This is essential — traceloop instrumentation on litellm would otherwise record every full prompt + completion, which would bloat files and leak secrets past the sanitizer.
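The pruning step amounts to a prefix filter plus a counter (a sketch; the prefix tuple mirrors the attributes named above):

```python
# Attribute prefixes dropped before span export.
PRUNED_PREFIXES = (
    "llm.input", "llm.output",
    "gen_ai.prompt.", "gen_ai.completion.",
    "llm.input_messages.", "llm.output_messages.",
)

def prune_attributes(attributes: dict) -> dict:
    filtered = [k for k in attributes if k.startswith(PRUNED_PREFIXES)]
    kept = {k: v for k, v in attributes.items() if k not in filtered}
    # Record how much was dropped, but never what.
    kept["strix.filtered_attributes_count"] = len(filtered)
    return kept
```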


3. Resource Paths (strix/utils/resource_paths.py:1-14)

The helper that makes dual pip/pyinstaller deployment work:

def get_strix_resource_path(*parts: str) -> Path:
    frozen_base = getattr(sys, "_MEIPASS", None)  # PyInstaller temp dir
    if frozen_base:
        base = Path(frozen_base) / "strix"
        if base.exists():
            return base.joinpath(*parts)
    # Development / pip-install mode
    base = Path(__file__).resolve().parent.parent  # repo_root/strix
    return base.joinpath(*parts)

Used to locate:

One code path; works whether Strix was pip-installed (files under site-packages/) or pyinstaller-frozen (files under _MEIPASS/strix).


4. Packaging (strix.spec)

PyInstaller spec that produces standalone binaries.

4.1 Data Files (strix.spec:10-27)

Bundled alongside the binary:

4.2 Hidden Imports (strix.spec:35-146)

An explicit list is needed because PyInstaller can't statically detect litellm's dynamic per-provider imports (OpenAI, Anthropic, Vertex, Bedrock, Azure), nor Strix's own dynamically loaded modules (agents, llm, runtime, telemetry, tools, skills).
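A fragment in the shape of that spec entry (the module list is abridged and illustrative; PyInstaller injects Analysis into the spec's namespace when it executes the file):

```python
hiddenimports = [
    "litellm",            # provider modules are imported dynamically at runtime
    "strix.agents",
    "strix.llm",
    "strix.runtime",
    "strix.telemetry",
    "strix.tools",
    "strix.skills",
]

a = Analysis(
    ["strix/main.py"],    # entry-point path is illustrative
    hiddenimports=hiddenimports,
)
```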

4.3 Exclusions (strix.spec:148-199)

Kept out of the frozen binary (reduces size + surface):


5. Tests (tests/)

~23 Python files across 9 subdirs. Coverage is uneven — focused on infrastructure, sparse on the agent core.

  config/ (1 file): Config loading, LLM env change detection, Traceloop var persistence
  telemetry/ (3 files): Tracer JSONL output, sanitization, flags, OTEL span pruning
  tools/ (5 files): Skill loading, argument parsing, agent graph (whitebox mode), tool registration
  llm/ (2 files): LLM init side-effect tests, OTEL callback isolation
  interface/ (1 file): Git diff-scope resolution
  agents/, skills/, runtime/: no direct tests
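A test in the spirit of the config/ suite's precedence checks might look like this (the simplified resolver and names are illustrative, not from the actual suite):

```python
def resolve_model(env: dict[str, str], saved: dict[str, str]) -> str:
    # Mirrors the documented precedence: env var, then saved JSON, then default.
    return env.get("STRIX_LLM") or saved.get("strix_llm") or "default-model"

def test_env_beats_saved_config() -> None:
    saved = {"strix_llm": "saved-model"}
    assert resolve_model({"STRIX_LLM": "env-model"}, saved) == "env-model"
    assert resolve_model({}, saved) == "saved-model"
    assert resolve_model({}, {}) == "default-model"
```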

Patterns

What's not tested:

Regression coverage for the agent loop itself relies heavily on the XBEN benchmark suite (an external repo).


6. Benchmarks (benchmarks/README.md)

Points at the usestrix/benchmarks repo. The README reports:

No in-repo harness — benchmarks run as a separate project.


7. CI / CD (.github/workflows/build-release.yml)

Triggered by:

Build matrix (4 platforms)

Steps

  1. checkout, setup Python 3.12, install uv
  2. uv sync --frozen
  3. pyinstaller strix.spec --noconfirm
  4. Extract version from pyproject.toml
  5. Create archive (.tar.gz / .zip)
  6. Attach to GitHub release

What's missing

No PR/push test workflow is committed. Linting/testing is expected to run via the pre-commit config and the Makefile during local dev. This is a notable gap — merges land without CI validation.


8. Makefile

All commands use uv run for reproducible execution:

  setup-dev: install dev deps + pre-commit hooks
  install: production deps only (uv sync --no-dev)
  dev-install: all deps (uv sync)
  format: Ruff formatter
  lint: Ruff + pylint
  type-check: mypy + pyright (both strict)
  security: Bandit audit
  check-all: format + lint + type-check + security
  test / test-cov: pytest, without / with coverage
  pre-commit: run pre-commit on all files
  dev: format + lint + type-check + test
  clean: remove caches, .pyc files, htmlcov/

9. Scripts (scripts/)


10. Observations

Good ideas:

Pitfalls: