Hermes Agent v0.13.0: The First AI Agent Built to Actually Finish What It Starts
295 community contributors shipped 864 commits to make Hermes agents actually finish what they start — May 7, 2026
- Hermes Agent v0.13.0 ships a durable Kanban board with heartbeat monitoring, zombie detection, and hallucination recovery — the first open-source agent framework built specifically to stop silent task failures.
- The /goal command locks an agent on a target across every turn in a session, preventing the context drift that causes long-running agents to abandon work mid-task.
- 8 critical P0 security vulnerabilities closed in this release. Secret redaction is now on by default — it was opt-in before. If you are running a pre-v0.13 deployment, update now.
- MIT licensed and free. Runs on minimal hardware. Install it as a reliability layer alongside Claude Code or Cursor — not as a replacement.
Most AI agents fail silently. You assign a task, step away for an hour, and come back to a success message, but the work was never done, or was done wrong in a way that creates more cleanup than the original task required. Among stack operators who evaluated Hermes before this release, the most common concern was the reliability of the framework itself: the risk that a new orchestration layer becomes its own single point of failure in the workflows it was meant to protect. Hermes Agent v0.13.0, released May 7, 2026 by Nous Research, addresses that concern directly. The release name, Tenacity, is not marketing. It is a design statement.
What Hermes Agent v0.13.0 Actually Fixes
The reliability problem in autonomous AI agents is not a capability problem. Most modern agents — Claude Code, Codex, Cursor — can complete individual tasks competently. The failure point is persistence: agents drop tasks when sessions close, stall when handoffs fail, and report success on work they hallucinated rather than completed. Hermes v0.13.0 is the first release that treats durability as a first-class feature.
The v0.13.0 changelog — 864 commits, 588 merged PRs, 295 community contributors — is dominated by infrastructure changes rather than new capabilities. That is the signal. This is a reliability release, not a features release. The Nous Research team spent a full development cycle solving the problems that make operators distrust AI agents in production environments.
| Problem | Before v0.13.0 | After v0.13.0 | Mechanism |
|---|---|---|---|
| Task orphaned after agent crash | Silent — board freezes | Auto-reclaimed within heartbeat window | Kanban heartbeat + reclaim |
| Agent stops responding mid-task | No detection — zombie lingers | Detected and purged automatically | Zombie detection (darwin + cross-platform) |
| Agent claims task complete — it is not | Marked done. Work lost. | Gate fires. Task flagged for retry or review. | Hallucination recovery gate |
| Agent forgets goal in long session | Context compression kills focus | Re-anchored on every turn | /goal Ralph loop |
| Session lost after restart | Manual recovery required | Auto-resumes on gateway restart | Gateway auto-resume |
| Disk bloat from old checkpoints | Orphan repos accumulate | Real pruning with disk guardrails | Checkpoints v2 |
| Secrets exposed in debug logs | Raw upload before redaction | Redaction runs before upload | P0 security fix PR #19318 |
The Four Reliability Features Operators Need to Understand
Each of the four core v0.13.0 features addresses a specific failure mode that operators running production agent workflows encounter. Understanding what each one actually does — and what it does not do — matters before you add Hermes to a live stack.
Durable Kanban with heartbeat and zombie detection. The Kanban board is a SQLite-backed task queue where multiple Hermes workers share a single source of truth. Every worker emits heartbeats at a configurable interval. If a heartbeat goes missing, the board reclaims the task and re-queues it. Workers that exit without completing tasks are auto-blocked from picking up new work. The hallucination recovery gate — the newest addition — fires when a worker claims completion but the board cannot verify it. Rather than silently marking the task done, it flags the task for human review or retry. Per-task retry budgets and a unified failure counter across spawn, timeout, and crash outcomes prevent infinite retry loops from consuming your agent budget.
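The heartbeat-and-reclaim idea can be sketched in a few lines of Python on top of SQLite. This is a minimal illustration, not the Hermes implementation: the table schema, the function names, the `failed` status, and the `window` and `retry_budget` parameters are all assumptions made for the example.

```python
import sqlite3

# Hypothetical schema and helpers for illustration only;
# this is not the actual Hermes Kanban code.
SCHEMA = """
CREATE TABLE tasks (
    id INTEGER PRIMARY KEY,
    status TEXT NOT NULL DEFAULT 'queued',
    worker TEXT,
    last_heartbeat REAL,
    failures INTEGER NOT NULL DEFAULT 0
)
"""

def open_board():
    """Create an in-memory board with a single tasks table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn

def add_task(conn):
    return conn.execute("INSERT INTO tasks DEFAULT VALUES").lastrowid

def claim(conn, worker, now):
    """Hand the oldest queued task to a worker, or return None."""
    row = conn.execute(
        "SELECT id FROM tasks WHERE status = 'queued' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute(
        "UPDATE tasks SET status = 'running', worker = ?, last_heartbeat = ? "
        "WHERE id = ?",
        (worker, now, row[0]),
    )
    return row[0]

def heartbeat(conn, task_id, worker, now):
    """Record that the worker is still alive on this task."""
    conn.execute(
        "UPDATE tasks SET last_heartbeat = ? "
        "WHERE id = ? AND worker = ? AND status = 'running'",
        (now, task_id, worker),
    )

def reclaim_stale(conn, now, window, retry_budget):
    """Re-queue running tasks whose last heartbeat is older than `window`
    seconds. A task that has used up its retry budget is marked 'failed'
    instead, so it cannot loop forever."""
    stale = conn.execute(
        "SELECT id, failures FROM tasks "
        "WHERE status = 'running' AND last_heartbeat < ?",
        (now - window,),
    ).fetchall()
    for task_id, failures in stale:
        failures += 1
        status = "queued" if failures < retry_budget else "failed"
        conn.execute(
            "UPDATE tasks SET status = ?, worker = NULL, failures = ? WHERE id = ?",
            (status, failures, task_id),
        )
    return [task_id for task_id, _ in stale]

def status_of(conn, task_id):
    return conn.execute(
        "SELECT status FROM tasks WHERE id = ?", (task_id,)
    ).fetchone()[0]
```

Passing `now` in explicitly keeps the sketch deterministic and easy to test; a real board would read the clock itself and run the reclaim pass on a timer.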
The /goal command (Ralph loop). You issue a goal once. On every subsequent turn, a lightweight judge model evaluates whether the agent is still working toward that goal. If drift is detected, the agent is re-anchored. This addresses the most common complaint about long-running agents: context compression causes the model to start responding to incidental messages or tool outputs rather than progressing toward the original objective. The turn budget for the Ralph loop is configurable — if you have a goal that requires exploratory responses, you can widen the tolerance before re-anchoring fires.
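A toy version of the re-anchoring loop makes the tolerance setting concrete. Everything here is an assumption for illustration: `run_with_goal`, `drift_tolerance`, and the keyword-based `keyword_judge` are hypothetical stand-ins, and the real judge is a model rather than a word match.

```python
from typing import Callable, List

def run_with_goal(
    goal: str,
    turns: List[str],
    judge: Callable[[str, str], bool],
    drift_tolerance: int = 1,
) -> List[str]:
    """Replay `turns` and insert a re-anchor message whenever the judge
    reports more than `drift_tolerance` consecutive off-goal turns."""
    transcript: List[str] = []
    off_goal_streak = 0
    for turn in turns:
        transcript.append(turn)
        if judge(goal, turn):
            off_goal_streak = 0
            continue
        off_goal_streak += 1
        if off_goal_streak > drift_tolerance:
            transcript.append(f"[re-anchor] Current goal: {goal}")
            off_goal_streak = 0
    return transcript

def keyword_judge(goal: str, turn: str) -> bool:
    # Stand-in for the judge model: a turn counts as on-goal
    # if it shares at least one word with the goal.
    return bool(set(goal.lower().split()) & set(turn.lower().split()))
```

With `drift_tolerance=0` the loop re-anchors after every off-goal turn; raising it leaves room for exploratory turns before the goal is restated.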
Checkpoints v2 with /rollback. The prior checkpoint system created orphan shadow repositories with no pruning mechanism. Long-running Hermes deployments would accumulate gigabytes of orphan state over weeks. V2 is a full rewrite: single-store architecture, real pruning, disk guardrails, and a /rollback command that actually works. For operators running Hermes on a $5 VPS, the disk management improvement is significant.
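Pruning under a disk guardrail comes down to a retention policy. The sketch below shows one plausible policy, assuming a count limit (`keep_last`) and a byte budget (`max_total_bytes`); both names, the `Checkpoint` type, and the policy itself are hypothetical and not taken from the Hermes codebase.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

@dataclass(frozen=True)
class Checkpoint:
    name: str
    size_bytes: int
    created_at: float

def plan_prune(
    checkpoints: Iterable[Checkpoint],
    keep_last: int,
    max_total_bytes: int,
) -> Tuple[List[Checkpoint], List[Checkpoint]]:
    """Split checkpoints into (kept, pruned).

    Walks from newest to oldest and keeps a contiguous run of the newest
    checkpoints that fits both limits. The newest checkpoint is always
    kept so a rollback target exists even if it alone exceeds the budget.
    """
    newest_first = sorted(checkpoints, key=lambda c: c.created_at, reverse=True)
    kept: List[Checkpoint] = []
    pruned: List[Checkpoint] = []
    total = 0
    for cp in newest_first:
        within_budget = (
            not pruned
            and len(kept) < keep_last
            and total + cp.size_bytes <= max_total_bytes
        )
        if within_budget or not kept:
            kept.append(cp)
            total += cp.size_bytes
        else:
            pruned.append(cp)
    return kept, pruned
```

Returning a plan rather than deleting files directly lets an operator log or review what would be removed before anything touches the disk.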
Gateway auto-resume. If the gateway restarts — whether from a system update, a source-file reload, or a crash — conversations automatically resume when it comes back online. Previously, any mid-session restart required manual recovery. For operators running cron-triggered workflows overnight, this eliminates the silent failure mode where a restart at 3am orphans every active session.
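Auto-resume depends on session state surviving the restart. One common way to get that, shown here purely as an assumption about how such a feature could work, is to write the session map to disk atomically and reload it on startup. The file layout and the `save_sessions` and `load_resumable` names are hypothetical.

```python
import json
import os
import tempfile

def save_sessions(path, sessions):
    """Write the session map atomically: write a temp file in the same
    directory, then swap it into place with os.replace. A crash mid-write
    leaves the previous file intact."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(sessions, handle)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if the write or the swap failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

def load_resumable(path):
    """Return the IDs of sessions that were active at the last save."""
    try:
        with open(path) as handle:
            sessions = json.load(handle)
    except FileNotFoundError:
        return []
    return [sid for sid, state in sessions.items() if state == "active"]
```

On startup, a gateway built this way would call `load_resumable` and reattach each returned session instead of waiting for an operator to notice the gap.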
The name Tenacity is not marketing. It is a design statement. Every major feature in v0.13.0 — Kanban durability, zombie detection, /goal persistence, auto-resume — addresses the same root problem: AI agents that start work but do not finish it. Nous Research spent a full release cycle on infrastructure instead of features. That is a signal worth taking seriously.
– FutureAIStack editorial
The 8 Security Fixes Every Operator Needs to Know
The security changes in v0.13.0 are not minor. Four of the eight P0 closures address attack vectors that are actively exploited against developer infrastructure in 2026. The Mini Shai-Hulud npm worm campaign on May 11, 2026 specifically targeted Claude Code configuration credentials and developer tooling environments. Hermes deployments running pre-v0.13 versions are in scope for several of those attack patterns.
The most operationally significant change is that secret redaction is now on by default. In all previous versions, redaction was opt-in — operators had to explicitly enable it to prevent secrets from appearing in session logs and debug outputs. That default was wrong. V0.13.0 corrects it. If you are upgrading from any prior version, verify that your redaction patterns are configured correctly before assuming the default covers your secrets.
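One way to check whether a set of redaction patterns covers your secrets is to run known sample values through them and see what survives. The patterns and helper names below are illustrative assumptions, not the patterns Hermes ships with.

```python
import re

# Hypothetical patterns for illustration; real deployments need
# patterns matched to their own secret formats.
BEARER = re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]+=*")
KEY_VALUE = re.compile(
    r"(?i)\b(api[_-]?key|token|password|secret)\b(\s*[=:]\s*)(\S+)"
)

def redact(text):
    """Mask bearer tokens and key=value style secrets in a log line."""
    text = BEARER.sub("Bearer [REDACTED]", text)
    return KEY_VALUE.sub(
        lambda m: m.group(1) + m.group(2) + "[REDACTED]", text
    )

def leaks(samples):
    """Given (log_line, secret) pairs, return the secrets that still
    appear in the line after redaction."""
    return [secret for line, secret in samples if secret in redact(line)]
```

In this sketch a password embedded in a connection URL slips past both patterns, which is exactly the kind of gap a pre-upgrade check is meant to surface.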
| Vulnerability | Severity | What It Allowed | Fixed In |
|---|---|---|---|
| Secret redaction default | P0 | Secrets exposed in session logs and debug outputs | PRs #17691, #21193 |
| Discord guild scope bypass | P0 — CVSS 8.1 | User in Guild A could impersonate allowed roles in Guild B | PR #21241 |
| WhatsApp stranger ingestion | P0 | Unknown numbers could send commands to the agent | PR #21291 |
| MCP OAuth TOCTOU | P0 | Race condition when saving OAuth credentials | PR #21176 |
| auth.json TOCTOU | P0 | Same race class in credential writers | PR #21194 |
| Browser SSRF floor | P0 | Cloud metadata endpoints accessible via browser tool | PR #21228 |
| Debug share raw upload | P0 | Log files uploaded before redaction ran | PR #19318 |
| Cron prompt injection | P0 | Skill content could inject commands into cron prompts | PR #21350 |
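The browser SSRF row is the easiest of these to picture in code. The check below is a generic sketch of an address floor using Python's standard library, assuming the guard runs before the browser tool fetches a URL; `is_blocked_target` and `BLOCKED_HOSTNAMES` are hypothetical names, and this is not the logic from PR #21228.

```python
import ipaddress
from urllib.parse import urlsplit

# Hostnames that resolve to cloud metadata services. This short list is
# an illustrative assumption, not an exhaustive one.
BLOCKED_HOSTNAMES = {"metadata.google.internal"}

def is_blocked_target(url):
    """Return True if the URL points at a loopback, link-local, or
    private address, or at a known metadata hostname."""
    host = urlsplit(url).hostname
    if host is None:
        return True  # unparseable targets are refused
    if host.lower() in BLOCKED_HOSTNAMES:
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name: this sketch lets it through. A real guard would
        # also resolve the name and re-check the resulting address.
        return False
    return ip.is_loopback or ip.is_link_local or ip.is_private
```

The address `169.254.169.254`, used by several cloud providers for instance metadata, falls in the link-local range and is refused by this check.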
When Does Hermes v0.13.0 Actually Pay Off for Your Stack
Hermes is not the right tool for every operator. The overhead of spinning up a self-hosted multi-agent framework — configuring providers, managing profiles, deploying on a VPS — is only justified when the workflows you are running are long enough, complex enough, or failure-sensitive enough that silent agent failures are costing you real time or money. The threshold is lower than most people assume.
If you are running a single-session coding workflow — asking Claude Code to fix a bug, review a PR, or refactor a function — Hermes adds no value. Claude Code handles that natively and reliably. The value of Hermes appears when you are running tasks that span multiple turns, involve multiple agents handing off to each other, need to survive session interruptions, or require an audit trail of what the agent actually did. A content pipeline that runs overnight, a data processing job that spans multiple model calls, a monitoring workflow that fires on schedule and needs to complete reliably — these are Hermes use cases. For those workflows, the time saved by eliminating silent failures and manual session recovery is measurable from day one.
The economics are clear at the free tier. Hermes runs on a $5 VPS. MIT license means no seat costs, no usage limits at the framework level. Your only costs are the underlying LLM provider charges — the same charges you would pay running those workflows in any other tool. The difference is that v0.13.0 eliminates the cost of failure: the time spent recovering from orphaned sessions, the work recreated after silent drops, the debugging required to figure out why an agent reported success on a task it never completed. For the operator running 10 or more agentic workflows per week, that recovery overhead is where the real cost sits. For a deeper look at how to evaluate AI tool costs against actual productivity gains, read our guide on GitHub Copilot’s shift to usage-based billing — the same cost discipline applies here.
Claude Code vs Hermes Agent: Which One Does Your Stack Actually Need
The most common question operators ask when evaluating Hermes is whether it replaces their existing Claude Code or Cursor setup. It does not. The two tools operate at different layers of the stack and serve genuinely different use cases. Understanding the boundary between them prevents both underuse and overengineering.
Claude Code handles single-session coding tasks with precision — bug fixes, refactors, PR reviews, code generation within a defined scope. Hermes handles orchestration of work that outlasts a single session, involves multiple agents handing off to each other, or needs to survive system restarts and recover from failures automatically. The value proposition only appears when your workflows cross those thresholds.
| Task or workflow | Claude Code | Hermes Agent | Both |
|---|---|---|---|
| Fix a bug in a single file | Best choice | | |
| Review a pull request | Best choice | | |
| Long-running overnight task | | Required | |
| Multi-agent task handoffs | | Required | |
| Session survives system restart | | Required | |
| Cron-triggered scheduled workflows | | Best choice | |
| Cross-platform messaging integrations | | Best choice | |
| Autonomous skill improvement over time | | Best choice | |
| Reading repo context + writing code | | | Complementary |
| Monitoring agent state during task | | | Hermes exposes MCP to Claude Code |
Hermes Agent v0.13.0 is the first open-source agent release that treats reliability as a first-class feature rather than an afterthought. If you are running multi-step agentic workflows and losing work to silent failures, the Tenacity Release is worth deploying this week, not eventually. Install it as the reliability layer next to Claude Code or Cursor, not as a replacement. It is free, MIT licensed, and built by 295 contributors who clearly got tired of watching agents fail silently. Start with the AlphaSignal weekend install guide, and see our Claude Code rate limits overview for the provider context you will need.
Frequently Asked Questions
What is Hermes Agent v0.13.0?
Hermes Agent v0.13.0, released May 7, 2026 by Nous Research, is an open-source autonomous AI agent framework. The v0.13.0 Tenacity Release introduced durable multi-agent Kanban with zombie detection, the /goal command for persistent task focus, Checkpoints v2 for state persistence, and 8 critical security fixes. It is MIT licensed and free to self-host.
How does zombie detection work?
Zombie detection in Hermes Agent v0.13.0 identifies agent workers that have stopped responding but have not cleanly exited. When a zombie is detected, the Kanban board automatically reclaims its assigned tasks, re-queues them for a healthy worker, and blocks the zombie worker from picking up new tasks until it is restarted.
What does the /goal command do?
The /goal command locks an agent onto a specific objective across every turn in a session. Using the Ralph loop mechanism, the agent re-anchors to the stated goal each turn, preventing conversational drift or context compression from causing it to abandon the original task. The turn budget is configurable (added in PR #21287).
Is Hermes Agent free?
Yes. Hermes Agent is MIT licensed and completely free. It is self-hosted and designed to run on minimal hardware, including sub-$10 VPS deployments. The Docker image runs in resource-constrained environments. There are no subscription fees, usage limits, or API costs for the framework itself; only the underlying LLM provider costs apply.
What are the known limitations?
Known limitations include: macOS Python 3.13 conflicts requiring Python 3.11 or 3.12; no native Windows support (WSL2 required); /goal judge model can misfire on ambiguous goals; 35 total vulnerabilities were identified in April 2026 with 8 P0s closed in this release but not all 35 resolved; and DeepSeek V4-Pro cost tracking is broken in session analytics.
How does Hermes Agent compare with Claude Code?
Hermes Agent and Claude Code serve different roles. Claude Code is a coding-focused agent optimized for single-session development tasks. Hermes Agent is a multi-platform, multi-agent orchestration framework designed for long-running autonomous workflows across 20 messaging platforms. They are complementary: Hermes exposes itself as an MCP server that Claude Code can read.
Which security vulnerabilities did v0.13.0 close?
Hermes Agent v0.13.0 closed 8 P0-level security vulnerabilities: secret redaction now ON by default, Discord role-allowlists scoped to guild level, WhatsApp rejecting unknown numbers by default, MCP OAuth TOCTOU race condition closed, auth.json TOCTOU race closed, browser SSRF floor blocking cloud metadata endpoints, debug share logs now redacted before upload, and cron prompts scanned for injection attacks.
Which messaging platforms does Hermes Agent support?
Hermes Agent v0.13.0 supports 20 messaging platforms. Google Chat was added as the 20th platform in this release. Other supported platforms include Slack, Discord, Telegram, WhatsApp, iMessage, WeChat, Matrix, QQ, and more. The v0.13.0 release also introduced a generic plugin surface so third-party platform adapters can be added without modifying core code.
What is the Hermes Agent Kanban?
The Hermes Agent Kanban is a durable SQLite-backed task board where multiple AI workers share a single source of truth. Each worker emits heartbeats; missed heartbeats trigger automatic task reclaim. Workers that exit without completing tasks are auto-blocked. A hallucination recovery gate verifies completion claims before marking tasks done, preventing false completions from stalling the queue.
What is Checkpoints v2?
Checkpoints v2 in Hermes Agent v0.13.0 is a full rewrite of the prior system. The old implementation created orphan shadow repositories and accumulated disk bloat with no pruning mechanism. V2 uses a single-store architecture with real pruning, disk guardrails, and a /rollback command. It eliminates the orphan state accumulation that caused long-running deployments to consume excessive disk space.
Does Hermes Agent run on Windows?
Hermes Agent v0.13.0 does not support native Windows installations. Windows users must use WSL2. The v0.13.0 release significantly expanded the WSL2 installation guide to make this path more accessible, and native Windows support is listed as a roadmap item. Mac users on Python 3.13 should downgrade to Python 3.11 or 3.12 due to dependency conflicts.