Hermes Agent - Overview

What It Is

Hermes Agent is a self-improving, multi-platform AI agent built by Nous Research. It provides an LLM-powered assistant that can execute tools (terminal commands, file operations, web browsing, code execution), learn from experience by creating and refining reusable skills, maintain persistent memory across sessions, and communicate over 20+ messaging platforms from a single gateway process.

Version: 0.9.0
License: MIT
Language: Python 3.11+
Repository: NousResearch/hermes-agent

What Problem It Solves

Most AI assistants are stateless chat interfaces locked to a single provider and a single interface. Hermes Agent addresses five gaps simultaneously:

Gap	How Hermes Solves It
No persistence	Memory system (MEMORY.md, USER.md) survives across sessions. Session search (FTS5 + LLM summarization) enables cross-session recall.
No learning	Skills system: agent creates reusable procedural skills from experience, improves them during use, and shares them via Skills Hub.
Single interface	Multi-platform gateway: Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Email, SMS, Home Assistant, and more - all from one process.
Provider lock-in	Works with any OpenAI-compatible endpoint: OpenRouter (200+ models), Anthropic, Google Gemini, Nous Portal, Kimi, MiniMax, Hugging Face, local models via LM Studio/Ollama/vLLM. Switch with `hermes model`.
Laptop-bound	Runs on a $5 VPS, GPU clusters, Docker, Modal (serverless), Daytona, SSH remotes. Talk to it from Telegram while it works on a cloud VM.

Target Users

Power users & developers who want a personal AI agent that learns their preferences, runs unattended cron jobs, and can be reached from any messaging platform.
AI/ML researchers who need batch trajectory generation, RL training environments (Atropos), and trajectory compression for training tool-calling models.
Teams who want a self-hosted, privacy-respecting AI assistant that isn't tied to a specific vendor.

Tech Stack

Core Language & Runtime

Python 3.11+ - Primary language (888 .py files)
asyncio - Gateway and concurrent operations
Threading - Parallel tool execution, subagent delegation

LLM Integration

Dependency	Purpose
`openai>=2.21.0`	Primary API client (OpenAI-compatible endpoints)
`anthropic>=0.39.0`	Native Anthropic API (Claude models with thinking/caching)
`boto3` (optional)	AWS Bedrock Converse API

CLI / TUI

Dependency	Purpose
`prompt_toolkit>=3.0.52`	Interactive terminal UI with multiline editing, autocomplete
`rich>=14.3.3`	Rich text formatting, progress bars, panels
`fire>=0.7.1`	CLI argument parsing (`hermes` command)

Web & HTTP

Dependency	Purpose
`httpx[socks]>=0.28.1`	Async HTTP client with SOCKS proxy support
`requests>=2.33.0`	Sync HTTP client (CVE-pinned)
`exa-py>=2.9.0`	Exa web search API
`firecrawl-py>=4.16.0`	Web scraping/extraction
`parallel-web>=0.4.2`	Parallel web fetching

Data & Config

Dependency	Purpose
`pyyaml>=6.0.2`	YAML config parsing
`python-dotenv>=1.2.1`	.env file loading
`pydantic>=2.12.5`	Data validation
`jinja2>=3.1.5`	Template rendering (skills, prompts)

Messaging Platforms (optional extras)

Extra	Key Dependencies
`messaging`	python-telegram-bot, discord.py, slack-bolt, aiohttp
`matrix`	matrix-nio
`homeassistant`	websockets
`sms`	twilio

Execution Environments

Extra	Key Dependencies
`modal`	modal>=1.0.0 (serverless cloud)
`daytona`	daytona-sdk>=0.148.0 (cloud dev environments)
Docker	Built-in support via subprocess
SSH	Built-in via subprocess
Singularity	Built-in for HPC

Research & Training

Extra	Key Dependencies
`rl`	atroposlib, tinker (git), wandb, fastapi
`yc-bench`	datasets, rouge-score

Other Notable Dependencies

Dependency	Purpose
`tenacity>=9.1.4`	Retry logic with exponential backoff
`PyJWT[crypto]>=2.12.0`	JWT token handling (OAuth, Codex)
`fal-client>=0.13.1`	Image generation via FAL
`edge-tts>=7.2.7`	Free text-to-speech (no API key needed)
`faster-whisper` (optional)	Local speech-to-text

Why These Choices

OpenAI SDK as universal client: The openai library's chat completions format is the de facto standard. Nearly every LLM provider (OpenRouter, Nous, vLLM, llama.cpp, LM Studio) exposes an OpenAI-compatible endpoint. Using it as the primary client means one code path handles 200+ models.
Anthropic SDK for native features: Claude's extended thinking, prompt caching, and interleaved reasoning require the native API. The Anthropic adapter (agent/anthropic_adapter.py) activates automatically for direct Anthropic connections.
prompt_toolkit over alternatives: Provides real terminal UX - multiline editing, syntax highlighting, autocomplete, history - unlike simple input() loops.
SQLite (WAL mode) for sessions: Zero-dependency, file-based, concurrent-read-safe. FTS5 enables cross-session search without an external search engine.
Subprocess-based environments: Docker, SSH, Modal, and Singularity are all driven via subprocess calls rather than client libraries (except Modal/Daytona SDKs). This avoids heavy dependencies and works identically across platforms.

Entry Points

# pyproject.toml
[project.scripts]
hermes = "hermes_cli.main:main"        # Primary CLI
hermes-agent = "run_agent:main"         # Direct agent invocation
hermes-acp = "acp_adapter.entry:main"   # Agent Communication Protocol server

Command	What It Does
`hermes`	Interactive TUI, model selection, tools config, gateway start
`hermes model`	Choose LLM provider and model
`hermes gateway`	Start multi-platform messaging gateway
`hermes tools`	Configure enabled tools
`hermes cron`	Manage scheduled tasks
`hermes doctor`	Diagnose configuration issues
`hermes update`	Self-update

Repository Scale

Metric	Value
Python files	888
Largest file	`run_agent.py` (585 KB, ~11,500 lines)
Second largest	`cli.py` (447 KB, ~10,000 lines)
Gateway runner	`gateway/run.py` (~9,800 lines)
Platform adapters	26+
Built-in tools	40+
Skills categories	28 built-in + 16 optional
Test files	49 directories under `tests/`