Hermes Agent - System Architecture
hermes-agent
Hermes Agent - System Architecture
High-Level Architecture
HERMES AGENT ARCHITECTURE
┌─────────────────────────────────────────────────────────────────────────┐
│ ENTRY POINTS │
│ │
│ hermes (CLI/TUI) hermes gateway hermes-acp batch_runner │
│ ┌──────────┐ ┌──────────┐ ┌─────────┐ ┌──────────┐ │
│ │ cli.py │ │gateway/ │ │acp_ │ │batch_ │ │
│ │ (10K LOC) │ │run.py │ │adapter/ │ │runner.py │ │
│ └─────┬─────┘ │(9.8K LOC)│ └────┬────┘ └────┬─────┘ │
│ │ └────┬─────┘ │ │ │
└─────────┼────────────────────┼──────────────────┼──────────────┼────────┘
│ │ │ │
▼ ▼ ▼ ▼
┌─────────────────────────────────────────────────────────────────────────┐
│ AGENT CORE │
│ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ AIAgent (run_agent.py) │ │
│ │ 11.5K lines │ │
│ │ │ │
│ │ ┌─────────────┐ ┌────────────────┐ ┌──────────────────┐ │ │
│ │ │ System Prompt│ │ Conversation │ │ Tool Execution │ │ │
│ │ │ Builder │ │ Loop │ │ Engine │ │ │
│ │ │(prompt_ │ │(run_ │ │(_invoke_tool, │ │ │
│ │ │ builder.py) │ │ conversation())│ │ _execute_tool_ │ │ │
│ │ └──────┬───────┘ └───────┬────────┘ │ calls) │ │ │
│ │ │ │ └────────┬─────────┘ │ │
│ │ │ │ │ │ │
│ │ ┌──────┴───────┐ ┌──────┴────────┐ ┌────────┴─────────┐ │ │
│ │ │ Context │ │ Error Recovery│ │ Iteration Budget │ │ │
│ │ │ Compressor │ │ & Fallback │ │ & Rate Control │ │ │
│ │ └──────────────┘ └───────────────┘ └──────────────────┘ │ │
│ └──────────────────────────────────────────────────────────────┘ │
│ │
└────────────┬───────────────────┬───────────────────┬────────────────────┘
│ │ │
┌─────────┴─────┐ ┌────────┴────────┐ ┌──────┴───────┐
▼ ▼ ▼ ▼ ▼ ▼
┌──────┐ ┌──────────┐ ┌────────────────┐ ┌──────────────────┐
│ LLM │ │ Tool │ │ Persistence │ │ Platform │
│Provid│ │ Registry │ │ Layer │ │ Adapters │
│ ers │ │ │ │ │ │ │
│ │ │40+ tools │ │ SessionDB │ │ Telegram │
│OpenAI│ │in 7+ │ │ (SQLite+FTS5) │ │ Discord │
│Anthro│ │toolsets │ │ │ │ Slack │
│Gemini│ │ │ │ MemoryStore │ │ WhatsApp │
│Bedro│ │Terminal │ │ (MEMORY.md) │ │ Signal │
│ck │ │File │ │ │ │ Matrix │
│OpenRo│ │Browser │ │ SkillStore │ │ Email │
│uter │ │Code Exec │ │ (~/.hermes/ │ │ HomeAssistant │
│Nous │ │Delegate │ │ skills/) │ │ 18+ more... │
│Custom│ │Memory │ │ │ │ │
│ │ │MCP │ │ Config │ │ │
│ │ │Skills │ │ (config.yaml) │ │ │
└──────┘ └──────────┘ └────────────────┘ └──────────────────┘
Core Components and Responsibilities
1. Agent Core (run_agent.py)
The AIAgent class is the central orchestrator. It owns the conversation lifecycle:
| Responsibility | Method/Location |
|---|---|
| Constructor / provider detection | __init__() (line ~559) |
| Main conversation cycle | run_conversation() (line ~8103) |
| System prompt assembly | _build_system_prompt() via agent/prompt_builder.py |
| Streaming API calls | _interruptible_streaming_api_call() |
| Tool call validation & repair | Lines ~10380-10530 |
| Tool dispatch (concurrent/sequential) | _execute_tool_calls() (line ~10595) |
| Individual tool invocation | _invoke_tool() (line ~7182) |
| Error classification & recovery | classify_api_error() + 10-level recovery hierarchy |
| Context compression trigger | Via agent/context_compressor.py |
| Token usage tracking | normalize_usage() (line ~9219) |
2. Provider Layer (agent/)
Abstracts differences between LLM providers:
| File | Responsibility |
|---|---|
agent/anthropic_adapter.py |
Native Anthropic API: thinking budgets, prompt caching, output limits |
agent/bedrock_adapter.py |
AWS Bedrock Converse API adapter |
agent/auxiliary_client.py |
Side-task routing (compression, vision, search) with auto-detection |
agent/prompt_builder.py |
System prompt composition with skills index, memory, platform hints |
agent/context_compressor.py |
Proactive context compression when approaching token limits |
agent/memory_manager.py |
External memory provider orchestration (Honcho, Mem0, etc.) |
agent/skill_utils.py |
Skill matching, platform filtering, conditional activation |
3. Tool System (tools/)
Self-registering tools with runtime dispatch:
tools/registry.py ← Singleton registry (ToolEntry, ToolRegistry)
↑
tools/*.py ← Each calls registry.register() at import time
↑
model_tools.py ← Discovery: imports tool modules, triggers registration
↑
run_agent.py / cli.py ← Consumers: call get_tool_definitions(), handle_function_call()
4. Gateway (gateway/)
Multi-platform messaging router:
gateway/run.py ← GatewayRunner: lifecycle, message routing, commands
gateway/config.py ← Platform enum, session policy, gateway config
gateway/session.py ← Session state management, PII redaction
gateway/delivery.py ← Cron output delivery routing
gateway/stream_consumer.py ← Progressive message editing for streaming
gateway/platforms/base.py ← BasePlatformAdapter (abstract, 2133 lines)
gateway/platforms/*.py ← 26+ platform-specific adapters
5. Persistence Layer
| Store | Location | Purpose |
|---|---|---|
| SessionDB | hermes_state.py → SQLite WAL |
Conversation history, token counts, FTS5 search |
| MemoryStore | tools/memory_tool.py → ~/.hermes/memories/ |
Agent observations (MEMORY.md) + user profile (USER.md) |
| SkillStore | tools/skills_tool.py → ~/.hermes/skills/ |
Procedural knowledge as markdown+YAML |
| Config | hermes_cli/config.py → ~/.hermes/config.yaml |
User configuration |
| Cron Jobs | cron/jobs.py → ~/.hermes/cron/jobs.json |
Scheduled task definitions |
Data Flow: User Message to Response
sequenceDiagram
participant U as User
participant E as Entry Point<br/>(CLI/Gateway/ACP)
participant A as AIAgent
participant P as Prompt Builder
participant L as LLM Provider
participant T as Tool Registry
participant S as SessionDB
U->>E: Send message
E->>A: run_conversation(user_message)
A->>P: _build_system_prompt()
P-->>A: system prompt + skills index + memory snapshot
A->>S: Load/check session history
S-->>A: conversation messages[]
loop Tool-calling loop (max 90 iterations)
A->>A: Prepare messages (inject memory, plugin context)
A->>A: Sanitize (strip orphaned tool results)
A->>L: Streaming API call
L-->>A: Stream response chunks
A->>A: Parse response → assistant_message
A->>A: Validate tool_calls (repair names, check JSON)
alt Has tool_calls
A->>T: _execute_tool_calls()
T-->>A: tool results[]
A->>A: Append tool results to messages
A->>A: Check context compression threshold
else No tool_calls (final response)
A-->>E: Return final_response
end
end
A->>S: Persist session (messages, tokens, costs)
E-->>U: Display responseComponent Interaction Map
Dependency Direction
hermes_constants.py ←── (no deps, import-safe)
↑
hermes_logging.py ←── hermes_constants
↑
hermes_state.py ←── hermes_constants, hermes_logging
↑
tools/registry.py ←── (no deps)
↑
tools/*.py ←── registry, hermes_constants, utils
↑
model_tools.py ←── tools/registry, tools/*.py (import-time discovery)
↑
agent/*.py ←── model_tools, hermes_state, hermes_constants
↑
run_agent.py ←── agent/*, model_tools, hermes_state, tools/*
↑
cli.py ←── run_agent, hermes_cli/*
gateway/run.py ←── run_agent, gateway/*, hermes_state
batch_runner.py ←── run_agent, model_tools
Key Interfaces Between Components
AIAgent ↔ ToolRegistry:
- Agent calls
get_tool_definitions()to build tool schemas for LLM - Agent calls
handle_function_call(name, args)to execute tools - Registry returns string results (always JSON, never exceptions)
AIAgent ↔ ContextCompressor:
- Agent tracks token usage via
normalize_usage() - Compressor checks
should_compress()after each tool execution - Compression preserves first/last turns, summarizes middle via auxiliary LLM
AIAgent ↔ MemoryStore:
- Memory loaded as frozen snapshot at session start (injected into system prompt)
- Writes go to disk immediately but DON'T update the running system prompt
- This preserves Anthropic's prefix cache across the entire session
Gateway ↔ AIAgent:
- Gateway creates AIAgent instances per session
- Sets message handler callback on platform adapters
- Manages session lifecycle (creation, expiry, memory flush)
- Streams tokens via
GatewayStreamConsumerprogressive editing
Gateway ↔ Platform Adapters:
- All adapters inherit
BasePlatformAdapter - Uniform interface:
connect(),send(),handle_message() - Platform-specific overrides for media (images, voice, documents)
Directory Structure (Key Directories)
hermes-agent/
├── run_agent.py # AIAgent class - the brain (11.5K lines)
├── cli.py # CLI/TUI interface (10K lines)
├── model_tools.py # Tool discovery and async bridging
├── toolsets.py # Toolset definitions and resolution
├── toolset_distributions.py # Probability-weighted toolset sampling (research)
├── hermes_state.py # SessionDB (SQLite + FTS5)
├── hermes_constants.py # Import-safe constants, path resolution
├── hermes_logging.py # Centralized logging with redaction
├── trajectory_compressor.py # Context compression engine
├── batch_runner.py # Parallel batch trajectory generation
├── mini_swe_runner.py # SWE benchmark runner
├── rl_cli.py # RL training interface
├── mcp_serve.py # MCP server (expose sessions to external tools)
│
├── agent/ # Provider adapters, prompt building
│ ├── prompt_builder.py # System prompt composition
│ ├── anthropic_adapter.py # Native Anthropic/Claude support
│ ├── bedrock_adapter.py # AWS Bedrock adapter
│ ├── auxiliary_client.py # Side-task LLM routing
│ ├── context_compressor.py # Proactive context management
│ ├── memory_manager.py # External memory provider bridge
│ └── skill_utils.py # Skill matching and filtering
│
├── tools/ # 40+ tools with self-registration
│ ├── registry.py # Singleton tool registry
│ ├── terminal_tool.py # Command execution (6 backends)
│ ├── file_tools.py # Read/write/patch/search files
│ ├── browser_tool.py # Browser automation
│ ├── code_execution_tool.py# Python sandbox with RPC
│ ├── delegate_tool.py # Subagent spawning
│ ├── memory_tool.py # Persistent memory (MEMORY.md, USER.md)
│ ├── mcp_tool.py # MCP client (connect to external MCP servers)
│ ├── skills_tool.py # Skill listing and viewing
│ ├── skill_manager_tool.py # Skill CRUD operations
│ ├── approval.py # Dangerous command approval
│ ├── tirith_security.py # Content-level threat scanning
│ └── environments/ # Terminal backend abstraction
│ ├── base.py # Abstract base
│ ├── local.py # Direct host execution
│ ├── docker.py # Docker/Podman containers
│ ├── modal.py # Modal serverless
│ ├── ssh.py # SSH remote
│ ├── daytona.py # Daytona cloud
│ └── singularity.py # HPC containers
│
├── gateway/ # Multi-platform messaging gateway
│ ├── run.py # GatewayRunner (9.8K lines)
│ ├── config.py # Gateway configuration
│ ├── session.py # Session management
│ ├── delivery.py # Output routing (cron → platform)
│ ├── stream_consumer.py # Progressive message editing
│ └── platforms/ # 26+ platform adapters
│ ├── base.py # Abstract adapter (2.1K lines)
│ ├── telegram.py # Telegram (long-poll/webhook)
│ ├── discord.py # Discord (WebSocket + voice)
│ ├── slack.py # Slack (Bolt + Socket Mode)
│ ├── whatsapp.py # WhatsApp (Baileys Node bridge)
│ ├── signal.py # Signal
│ ├── matrix.py # Matrix/Element (E2E encryption)
│ └── ... # 20+ more adapters
│
├── skills/ # Built-in skill library (28 categories)
├── optional-skills/ # Official but opt-in skills (16 categories)
├── plugins/ # Plugin system (memory providers)
├── hermes_cli/ # CLI utilities, auth, config management
├── cron/ # Scheduled task system
├── acp_adapter/ # Agent Communication Protocol
├── tests/ # Test suite (49 directories)
├── scripts/ # Install, migration, utility scripts
├── web/ # Web interface
└── docs/ # Documentation
State Transitions
SESSION LIFECYCLE
┌──────────┐ user msg ┌──────────┐
│ IDLE │──────────────────▶│ ACTIVE │
│ │ │ │
│ No agent │ │ AIAgent │
│ running │◀──────────────────│ running │
└──────────┘ response done └─────┬────┘
│ │
│ idle_timeout │ context overflow
│ or /reset │ or threshold
▼ ▼
┌──────────┐ ┌──────────┐
│ EXPIRED │ │COMPRES- │
│ │ │SING │
│ Memory │ │ │
│ flushed │ │ Middle │
│ Session │ │ turns │
│ archived │ │ summar- │
└──────────┘ │ ized │
└─────┬────┘
│
▼
┌──────────┐
│ ACTIVE │
│(resumed) │
│ │
│ Shorter │
│ context │
└──────────┘
AGENT TURN LIFECYCLE
┌───────────┐
│ PREPARE │ Build system prompt, inject memory/plugins
└─────┬─────┘
│
▼
┌───────────┐ retry (max 3)
│ API CALL │◀───────────────────┐
│ │ │
│ Streaming │──── error ──────────┤
│ response │ │
└─────┬─────┘ ┌──────────┐ │
│ │ RECOVER │───┘
│ │ │
│ │ compress │
│ │ fallback │
│ │ refresh │
│ │ backoff │
│ └──────────┘
▼
┌───────────┐
│ PARSE │ Normalize response across providers
└─────┬─────┘
│
├── has tool_calls ──▶ EXECUTE TOOLS ──▶ back to API CALL
│
└── no tool_calls ──▶ DELIVER RESPONSE ──▶ PERSIST SESSION