Hermes Agent v0.13.0: The First AI Agent Built to Actually Finish What It Starts

May 12, 2026 · 19 min read

295 community contributors shipped 864 commits to make Hermes agents actually finish what they start — May 7, 2026

Community: 295 contributors in one release. 864 commits. 588 merged PRs.

Issues closed: 282 (13 P0, 36 P1), reliability and security focused.

Security fixes: 8 P0 critical vulnerabilities closed. Redaction now ON by default.

License: Free. MIT license. Runs on a $5 VPS. No usage limits.

TL;DR — what you need to know

Hermes Agent v0.13.0 ships a durable Kanban board with heartbeat monitoring, zombie detection, and hallucination recovery — the first open-source agent framework built specifically to stop silent task failures.
The /goal command locks an agent on a target across every turn in a session, preventing the context drift that causes long-running agents to abandon work mid-task.
8 critical P0 security vulnerabilities closed in this release. Secret redaction is now on by default — it was opt-in before. If you are running a pre-v0.13 deployment, update now.
MIT licensed and free. Runs on minimal hardware. Install it as a reliability layer alongside Claude Code or Cursor — not as a replacement.

Most AI agents fail silently. You assign a task, step away for an hour, come back to a success message, and the work was never done, or was done wrong in a way that creates more cleanup than the original task required. Among stack operators who evaluated Hermes before this release, the most common concern was the reliability of the framework itself: the risk that a new orchestration layer becomes its own single point of failure in the workflows it was meant to protect. Hermes Agent v0.13.0, released May 7, 2026 by Nous Research and named the Tenacity Release, addresses that concern directly. The name is not marketing. It is a design statement.

What Hermes Agent v0.13.0 Actually Fixes

The reliability problem in autonomous AI agents is not a capability problem. Most modern agents — Claude Code, Codex, Cursor — can complete individual tasks competently. The failure point is persistence: agents drop tasks when sessions close, stall when handoffs fail, and report success on work they hallucinated rather than completed. Hermes v0.13.0 is the first release that treats durability as a first-class feature.

For an operator running 10 agentic workflows per week, recovering from one silent failure takes about 45 minutes at a conservative estimate. At a $150/hour operator rate, that is $112.50 per failure event: about $450/month if failures occur weekly, and $1,125/month at ten failure events a month. Hermes v0.13.0 is built to close the infrastructure gaps that cause those failures.
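The per-failure figure is easy to check, and easy to rerun with your own numbers. The sketch below uses the 45-minute recovery time and $150/hour rate from the paragraph above; the function name and the monthly failure counts are illustrative inputs, not measurements.

```python
def monthly_failure_cost(
    failures_per_month: float,
    minutes_per_failure: float = 45.0,
    hourly_rate: float = 150.0,
) -> float:
    """Operator time spent recovering from silent failures, in dollars per month."""
    cost_per_failure = hourly_rate * minutes_per_failure / 60.0
    return failures_per_month * cost_per_failure


# One recovery session: 150 * 45 / 60 = 112.50
per_event = monthly_failure_cost(1)
# Roughly one failure a week (four events a month)
weekly = monthly_failure_cost(4)
# Ten failure events a month
ten_events = monthly_failure_cost(10)

print(f"per event: ${per_event:,.2f}")
print(f"4 events/month: ${weekly:,.2f}")
print(f"10 events/month: ${ten_events:,.2f}")
```

Swap in your own hourly rate and an honest count of last month's recovery sessions to see whether the overhead is large enough to justify another layer in your stack.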

AI agent traffic grew 7,851% year-over-year in 2025, according to HUMAN Security’s 2026 Bad Bot Report. 51% of all internet traffic is now automated. The operators building on agent infrastructure today are doing so in an environment where reliability failures are no longer edge cases — they are operating conditions.

The v0.13.0 changelog — 864 commits, 588 merged PRs, 295 community contributors — is dominated by infrastructure changes rather than new capabilities. That is the signal. This is a reliability release, not a features release. The Nous Research team spent a full development cycle solving the problems that make operators distrust AI agents in production environments.

Problem	Before v0.13.0	After v0.13.0	Mechanism
Task orphaned after agent crash	Silent — board freezes	Auto-reclaimed within heartbeat window	Kanban heartbeat + reclaim
Agent stops responding mid-task	No detection — zombie lingers	Detected and purged automatically	Zombie detection (darwin + cross-platform)
Agent claims task complete — it is not	Marked done. Work lost.	Gate fires. Task flagged for retry or review.	Hallucination recovery gate
Agent forgets goal in long session	Context compression kills focus	Re-anchored on every turn	/goal Ralph loop
Session lost after restart	Manual recovery required	Auto-resumes on gateway restart	Gateway auto-resume
Disk bloat from old checkpoints	Orphan repos accumulate	Real pruning with disk guardrails	Checkpoints v2
Secrets exposed in debug logs	Raw upload before redaction	Redaction runs before upload	P0 security fix PR #19318

The Four Reliability Features Operators Need to Understand

Each of the four core v0.13.0 features addresses a specific failure mode that operators running production agent workflows encounter. Understanding what each one actually does — and what it does not do — matters before you add Hermes to a live stack.

Durable Kanban with heartbeat and zombie detection. The Kanban board is a SQLite-backed task queue where multiple Hermes workers share a single source of truth. Every worker emits heartbeats at a configurable interval. If a heartbeat goes missing, the board reclaims the task and re-queues it. Workers that exit without completing tasks are auto-blocked from picking up new work. The hallucination recovery gate — the newest addition — fires when a worker claims completion but the board cannot verify it. Rather than silently marking the task done, it flags the task for human review or retry. Per-task retry budgets and a unified failure counter across spawn, timeout, and crash outcomes prevent infinite retry loops from consuming your agent budget.
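To make the heartbeat-and-reclaim idea concrete, here is a minimal sketch of a SQLite-backed task table in which stale claims are re-queued and a per-task retry budget caps the loop. This is not Hermes's schema or code: the table layout, the function names, the 30-second window, and the budget of 3 are all assumptions chosen for illustration.

```python
import sqlite3

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE tasks (
    id             INTEGER PRIMARY KEY,
    status         TEXT NOT NULL DEFAULT 'queued',  -- queued | claimed | failed
    worker         TEXT,
    last_heartbeat REAL,
    failures       INTEGER NOT NULL DEFAULT 0
)
"""


def open_board() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def add_task(conn: sqlite3.Connection) -> int:
    with conn:
        return conn.execute("INSERT INTO tasks DEFAULT VALUES").lastrowid


def claim(conn: sqlite3.Connection, worker: str, now: float):
    """Hand the oldest queued task to a worker. Single-connection sketch;
    a real multi-worker board would need an atomic claim."""
    row = conn.execute(
        "SELECT id FROM tasks WHERE status = 'queued' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    with conn:
        conn.execute(
            "UPDATE tasks SET status = 'claimed', worker = ?, last_heartbeat = ? "
            "WHERE id = ?",
            (worker, now, row[0]),
        )
    return row[0]


def heartbeat(conn: sqlite3.Connection, task_id: int, worker: str, now: float) -> None:
    with conn:
        conn.execute(
            "UPDATE tasks SET last_heartbeat = ? "
            "WHERE id = ? AND worker = ? AND status = 'claimed'",
            (now, task_id, worker),
        )


def reclaim_stale(conn, now: float, window: float = 30.0, retry_budget: int = 3):
    """Re-queue claimed tasks whose last heartbeat is older than the window.
    Once a task has used up its retry budget it is marked failed instead."""
    stale = conn.execute(
        "SELECT id, failures FROM tasks "
        "WHERE status = 'claimed' AND last_heartbeat < ?",
        (now - window,),
    ).fetchall()
    with conn:
        for task_id, failures in stale:
            failures += 1
            status = "failed" if failures >= retry_budget else "queued"
            conn.execute(
                "UPDATE tasks SET status = ?, worker = NULL, "
                "last_heartbeat = NULL, failures = ? WHERE id = ?",
                (status, failures, task_id),
            )
    return [task_id for task_id, _ in stale]


board = open_board()
task = add_task(board)
claim(board, "worker-a", now=0.0)
heartbeat(board, task, "worker-a", now=10.0)
print(reclaim_stale(board, now=20.0))  # heartbeat is recent, nothing reclaimed
print(reclaim_stale(board, now=45.0))  # worker went quiet, task is re-queued
```

Timestamps are passed in explicitly so the behavior is deterministic; a running system would use a clock and run the reclaim step on a timer.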

The /goal command (Ralph loop). You issue a goal once. On every subsequent turn, a lightweight judge model evaluates whether the agent is still working toward that goal. If drift is detected, the agent is re-anchored. This addresses the most common complaint about long-running agents: context compression causes the model to start responding to incidental messages or tool outputs rather than progressing toward the original objective. The turn budget for the Ralph loop is configurable — if you have a goal that requires exploratory responses, you can widen the tolerance before re-anchoring fires.
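The control flow described here, a judge that checks each turn and a tolerance before re-anchoring fires, can be sketched in a few lines. Everything below is hypothetical: the class, the `drift_tolerance` parameter, and the keyword-matching judge are stand-ins for illustration, and Hermes's actual judge is a model, not a keyword match.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# A judge takes (goal, reply) and returns True when the reply still serves the goal.
Judge = Callable[[str, str], bool]


def keyword_judge(goal: str, reply: str) -> bool:
    """Toy stand-in for a judge model: on-goal if any longer goal word appears."""
    keywords = [word for word in goal.lower().split() if len(word) > 3]
    return any(word in reply.lower() for word in keywords)


@dataclass
class GoalLoop:
    goal: str
    judge: Judge
    drift_tolerance: int = 2  # consecutive off-goal turns before re-anchoring
    drift_streak: int = 0

    def observe(self, reply: str) -> Optional[str]:
        """Return a re-anchoring prompt when drift reaches the tolerance, else None."""
        if self.judge(self.goal, reply):
            self.drift_streak = 0
            return None
        self.drift_streak += 1
        if self.drift_streak < self.drift_tolerance:
            return None
        self.drift_streak = 0
        return f"Reminder: the active goal is still: {self.goal}"


loop = GoalLoop(goal="migrate the billing tables", judge=keyword_judge)
for reply in (
    "Started to migrate the invoices schema",
    "The weather API returned 503",
    "Retrying the weather API",
):
    print(repr(reply), "->", loop.observe(reply))
```

Raising `drift_tolerance` is the sketch's equivalent of widening the tolerance for exploratory work: the agent gets more consecutive off-goal turns before the reminder is injected.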

Checkpoints v2 with /rollback. The prior checkpoint system created orphan shadow repositories with no pruning mechanism. Long-running Hermes deployments would accumulate gigabytes of orphan state over weeks. V2 is a full rewrite: single-store architecture, real pruning, disk guardrails, and a /rollback command that actually works. For operators running Hermes on a $5 VPS, the disk management improvement is significant.
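As a rough picture of what pruning with a disk guardrail means in practice, the sketch below plans which checkpoints to keep under two limits: a maximum count and a maximum total size. The `Checkpoint` record, the `plan_prune` function, and both limits are assumptions for illustration; the source does not describe how Checkpoints v2 chooses what to delete.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Checkpoint:
    name: str
    created: float  # seconds since epoch
    size_bytes: int


def plan_prune(
    checkpoints: Sequence[Checkpoint],
    keep_last: int = 5,
    max_total_bytes: int = 1_000_000_000,
) -> Tuple[List[Checkpoint], List[Checkpoint]]:
    """Split checkpoints into (keep, drop), walking from newest to oldest.

    A checkpoint is dropped when it falls outside the newest `keep_last`
    or when keeping it would push the running total past the byte budget.
    The newest checkpoint is always kept so a rollback target exists.
    """
    newest_first = sorted(checkpoints, key=lambda cp: cp.created, reverse=True)
    keep: List[Checkpoint] = []
    drop: List[Checkpoint] = []
    total = 0
    for index, cp in enumerate(newest_first):
        over_count = index >= keep_last
        over_budget = total + cp.size_bytes > max_total_bytes
        if index > 0 and (over_count or over_budget):
            drop.append(cp)
        else:
            keep.append(cp)
            total += cp.size_bytes
    return keep, drop


MB = 1_000_000
history = [Checkpoint(f"cp{i}", created=float(i), size_bytes=400 * MB) for i in range(1, 5)]
keep, drop = plan_prune(history, keep_last=3, max_total_bytes=1000 * MB)
print("keep:", [cp.name for cp in keep])
print("drop:", [cp.name for cp in drop])
```

On a small VPS disk the byte budget is usually the binding limit, which is why the sketch checks it on every checkpoint, not only after the count limit is reached.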

Gateway auto-resume. If the gateway restarts — whether from a system update, a source-file reload, or a crash — conversations automatically resume when it comes back online. Previously, any mid-session restart required manual recovery. For operators running cron-triggered workflows overnight, this eliminates the silent failure mode where a restart at 3am orphans every active session.

35 Known Vulnerabilities

An April 2026 responsible disclosure from the cybersecurity community identified 35 vulnerabilities in the pre-v0.13 codebase, including 18 exploitable chains. V0.13.0 closes the 8 most critical P0s. The remaining 27 have not all been patched. Review your deployment threat model before running Hermes in internet-facing configurations.

/goal Judge Model Misfires

The Ralph loop uses a lightweight judge model to evaluate goal alignment. That judge can misclassify on-topic responses as drift — particularly on ambiguous or open-ended goals — causing premature re-anchoring that interrupts legitimate progress. Test /goal behavior with your specific use case before deploying in production workflows.

macOS and Windows Friction

macOS users on Python 3.13 hit dependency conflicts — use Python 3.11 or 3.12. Native Windows is unsupported; WSL2 is required. The v0.13.0 release expanded the WSL2 guide significantly, but bare Windows installs will fail. Plan accordingly before recommending Hermes to non-technical team members.

Cost Tracking Bug

Sessions run against the newly added DeepSeek V4-Pro model display as “unknown cost” in Hermes session analytics due to a cost-tracking bug introduced with the model addition (GitHub issue #24218). If you are using Hermes to monitor AI spend across a team, exclude DeepSeek V4-Pro sessions from cost reports until this is patched.

The name Tenacity is not marketing. It is a design statement. Every major feature in v0.13.0 — Kanban durability, zombie detection, /goal persistence, auto-resume — addresses the same root problem: AI agents that start work but do not finish it. Nous Research spent a full release cycle on infrastructure instead of features. That is a signal worth taking seriously.

— FutureAIStack editorial

The 8 Security Fixes Every Operator Needs to Know

The security changes in v0.13.0 are not minor. Four of the eight P0 closures address attack vectors that are actively exploited against developer infrastructure in 2026. The Mini Shai-Hulud npm worm campaign on May 11, 2026 specifically targeted Claude Code configuration credentials and developer tooling environments. Hermes deployments running pre-v0.13 versions are in scope for several of those attack patterns.

The most operationally significant change is that secret redaction is now on by default. In all previous versions, redaction was opt-in — operators had to explicitly enable it to prevent secrets from appearing in session logs and debug outputs. That default was wrong. V0.13.0 corrects it. If you are upgrading from any prior version, verify that your redaction patterns are configured correctly before assuming the default covers your secrets.
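One way to verify coverage is to push a log line containing known canary secrets through your patterns and assert that none survive. The sketch below is generic Python, not Hermes's redaction engine or configuration format; the two patterns and the `[REDACTED]` marker are assumptions you would replace with the formats your own secrets actually use.

```python
import re

# Hypothetical patterns for illustration; extend them to match your own secret formats.
KEY_VALUE = re.compile(r"(?i)\b(password|token|secret|api_key)(\s*[=:]\s*)(\S+)")
BARE_TOKEN = re.compile(r"\bsk-[A-Za-z0-9_-]{16,}\b")


def redact(text: str) -> str:
    """Mask key=value style secrets first, then bare API-key-like tokens."""
    text = KEY_VALUE.sub(r"\1\2[REDACTED]", text)
    return BARE_TOKEN.sub("[REDACTED]", text)


def leaked(canaries, text: str):
    """Return the canary secrets that still appear in the redacted text."""
    cleaned = redact(text)
    return [secret for secret in canaries if secret in cleaned]


sample = "login ok password=hunter2 key sk-abcdefghijklmnop1234"
print(redact(sample))
print(leaked(["hunter2", "sk-abcdefghijklmnop1234"], sample))
```

A non-empty result from `leaked` means at least one secret format is not covered by the patterns, which is exactly the case the default cannot be assumed to catch.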

Vulnerability	Severity	What It Allowed	Fixed In
Secret redaction default	P0	Secrets exposed in session logs and debug outputs	PRs #17691, #21193
Discord guild scope bypass	P0 — CVSS 8.1	User in Guild A could impersonate allowed roles in Guild B	PR #21241
WhatsApp stranger ingestion	P0	Unknown numbers could send commands to the agent	PR #21291
MCP OAuth TOCTOU	P0	Race condition when saving OAuth credentials	PR #21176
auth.json TOCTOU	P0	Same race class in credential writers	PR #21194
Browser SSRF floor	P0	Cloud metadata endpoints accessible via browser tool	PR #21228
Debug share raw upload	P0	Log files uploaded before redaction ran	PR #19318
Cron prompt injection	P0	Skill content could inject commands into cron prompts	PR #21350
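The browser SSRF row is the easiest of these to picture in code. A floor of this kind refuses any URL whose destination is an internal address, such as the 169.254.169.254 metadata endpoint used by major cloud providers. The sketch below uses only Python's standard library; the function names and the fail-closed handling of unresolved hostnames are assumptions, not a description of the change in PR #21228.

```python
import ipaddress
from typing import Iterable
from urllib.parse import urlsplit


def is_internal(ip_text: str) -> bool:
    """True for addresses a browsing tool should never be allowed to reach."""
    ip = ipaddress.ip_address(ip_text)
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local  # covers 169.254.0.0/16, including 169.254.169.254
        or ip.is_reserved
        or ip.is_multicast
    )


def url_allowed(url: str, resolved_ips: Iterable[str] = ()) -> bool:
    """Allow a URL only if every destination address is public.

    `resolved_ips` is what the caller's DNS lookup returned for the hostname.
    Hostnames with no resolved addresses fail closed.
    """
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
        candidates = [host]  # the URL used a literal IP address
    except ValueError:
        candidates = list(resolved_ips)
    if not candidates:
        return False
    return not any(is_internal(ip) for ip in candidates)


print(url_allowed("http://169.254.169.254/latest/meta-data/"))
print(url_allowed("https://example.com/", ["93.184.216.34"]))
print(url_allowed("https://metadata.internal/", ["169.254.169.254"]))
```

Checking the resolved addresses, not only the hostname string, matters because a public-looking name can point at an internal IP. The check also has to run against the address the connection actually uses, or a second DNS lookup can return something different.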

When Does Hermes v0.13.0 Actually Pay Off for Your Stack

Hermes is not the right tool for every operator. The overhead of spinning up a self-hosted multi-agent framework — configuring providers, managing profiles, deploying on a VPS — is only justified when the workflows you are running are long enough, complex enough, or failure-sensitive enough that silent agent failures are costing you real time or money. The threshold is lower than most people assume.

If you are running a single-session coding workflow — asking Claude Code to fix a bug, review a PR, or refactor a function — Hermes adds no value. Claude Code handles that natively and reliably. The value of Hermes appears when you are running tasks that span multiple turns, involve multiple agents handing off to each other, need to survive session interruptions, or require an audit trail of what the agent actually did. A content pipeline that runs overnight, a data processing job that spans multiple model calls, a monitoring workflow that fires on schedule and needs to complete reliably — these are Hermes use cases. For those workflows, the time saved by eliminating silent failures and manual session recovery is measurable from day one.

The economics are clear at the free tier. Hermes runs on a $5 VPS. MIT license means no seat costs, no usage limits at the framework level. Your only costs are the underlying LLM provider charges — the same charges you would pay running those workflows in any other tool. The difference is that v0.13.0 eliminates the cost of failure: the time spent recovering from orphaned sessions, the work recreated after silent drops, the debugging required to figure out why an agent reported success on a task it never completed. For the operator running 10 or more agentic workflows per week, that recovery overhead is where the real cost sits. For a deeper look at how to evaluate AI tool costs against actual productivity gains, read our guide on GitHub Copilot’s shift to usage-based billing — the same cost discipline applies here.

Claude Code vs Hermes Agent: Which One Does Your Stack Actually Need

The most common question operators ask when evaluating Hermes is whether it replaces their existing Claude Code or Cursor setup. It does not. The two tools operate at different layers of the stack and serve genuinely different use cases. Understanding the boundary between them prevents both underuse and overengineering.

Claude Code handles single-session coding tasks with precision — bug fixes, refactors, PR reviews, code generation within a defined scope. Hermes handles orchestration of work that outlasts a single session, involves multiple agents handing off to each other, or needs to survive system restarts and recover from failures automatically. The value proposition only appears when your workflows cross those thresholds.

Task or workflow	Claude Code	Hermes Agent	Both
Fix a bug in a single file	Best choice
Review a pull request	Best choice
Long-running overnight task		Required
Multi-agent task handoffs		Required
Session survives system restart		Required
Cron-triggered scheduled workflows		Best choice
Cross-platform messaging integrations		Best choice
Autonomous skill improvement over time		Best choice
Reading repo context + writing code			Complementary
Monitoring agent state during task			Hermes exposes MCP to Claude Code

Our verdict

Hermes Agent v0.13.0 is the first open-source agent release that treats reliability as a first-class feature rather than an afterthought. If you are running multi-step agentic workflows and losing work to silent failures, the Tenacity Release is worth deploying this week, not eventually. Install it as the reliability layer next to Claude Code or Cursor, not as a replacement. Free, MIT licensed, and built by 295 contributors who clearly got tired of watching agents fail silently. Start with the AlphaSignal weekend install guide, and see our Claude Code rate limits overview for the provider context you will need.

Frequently Asked Questions

What is Hermes Agent v0.13.0?

Hermes Agent v0.13.0, released May 7, 2026 by Nous Research, is an open-source autonomous AI agent framework. The v0.13.0 Tenacity Release introduced durable multi-agent Kanban with zombie detection, the /goal command for persistent task focus, Checkpoints v2 for state persistence, and 8 critical security fixes. It is MIT licensed and free to self-host.

What does zombie detection do in Hermes Agent?

Zombie detection in Hermes Agent v0.13.0 identifies agent workers that have stopped responding but have not cleanly exited. When a zombie is detected, the Kanban board automatically reclaims its assigned tasks, re-queues them for a healthy worker, and blocks the zombie worker from picking up new tasks until it is restarted.

What is the /goal command in Hermes Agent v0.13.0?

The /goal command locks an agent onto a specific objective across every turn in a session. Using the Ralph loop mechanism, the agent is checked against the stated goal each turn and re-anchored when it drifts, preventing conversational drift or context compression from causing it to abandon the original task. The turn budget is configurable (PR #21287).

Is Hermes Agent v0.13.0 free to use?

Yes. Hermes Agent is MIT licensed and completely free. It is self-hosted and designed to run on minimal hardware including sub-$10 VPS deployments. The Docker image runs in resource-constrained environments. There are no subscription fees, usage limits, or API costs for the framework itself — only the underlying LLM provider costs apply.

What are the known limitations of Hermes Agent v0.13.0?

Known limitations include: macOS Python 3.13 conflicts requiring Python 3.11 or 3.12; no native Windows support (WSL2 required); /goal judge model can misfire on ambiguous goals; 35 total vulnerabilities were identified in April 2026 with 8 P0s closed in this release but not all 35 resolved; and DeepSeek V4-Pro cost tracking is broken in session analytics.

How does Hermes Agent v0.13.0 compare to Claude Code for operators?

Hermes Agent and Claude Code serve different roles. Claude Code is a coding-focused agent optimized for single-session development tasks. Hermes Agent is a multi-platform, multi-agent orchestration framework designed for long-running autonomous workflows across 20 messaging platforms. They are complementary — Hermes exposes itself as an MCP server that Claude Code can read.

What security vulnerabilities were fixed in Hermes Agent v0.13.0?

Hermes Agent v0.13.0 closed 8 P0-level security vulnerabilities: secret redaction now ON by default, Discord role-allowlists scoped to guild level, WhatsApp rejecting unknown numbers by default, MCP OAuth TOCTOU race condition closed, auth.json TOCTOU race closed, browser SSRF floor blocking cloud metadata endpoints, debug share logs now redacted before upload, and cron prompts scanned for injection attacks.

What platforms does Hermes Agent v0.13.0 support?

Hermes Agent v0.13.0 supports 20 messaging platforms. Google Chat was added as the 20th platform in this release. Other supported platforms include Slack, Discord, Telegram, WhatsApp, iMessage, WeChat, Matrix, QQ, and more. The v0.13.0 release also introduced a generic plugin surface so third-party platform adapters can be added without modifying core code.

How does the Hermes Agent Kanban system work in v0.13.0?

The Hermes Agent Kanban is a durable SQLite-backed task board where multiple AI workers share a single source of truth. Each worker emits heartbeats; missed heartbeats trigger automatic task reclaim. Workers that exit without completing tasks are auto-blocked. A hallucination recovery gate verifies completion claims before marking tasks done, so that unverified claims are flagged for retry or review instead of being recorded as finished work.

What changed in Checkpoints v2 in Hermes Agent?

Checkpoints v2 in Hermes Agent v0.13.0 is a full rewrite of the prior system. The old implementation created orphan shadow repositories and accumulated disk bloat with no pruning mechanism. V2 uses a single-store architecture with real pruning, disk guardrails, and a /rollback command. It eliminates the orphan state accumulation that caused long-running deployments to consume excessive disk space.

Does Hermes Agent v0.13.0 run on Windows?

Hermes Agent v0.13.0 does not support native Windows installations. Windows users must use WSL2. The v0.13.0 release significantly expanded the WSL2 installation guide to make this path more accessible, and native Windows support is listed as a roadmap item. Mac users on Python 3.13 should downgrade to Python 3.11 or 3.12 due to dependency conflicts.
