AI News

Hermes Agent: How Nous Research Built a Learning Agent That Actually Learns.

Nous Research's Hermes Agent has hit 22,000 GitHub stars with a novel approach to agent memory and learning. We review the architecture and its implications for the future of autonomous systems.

Daniel Fleuren2026-06-0810 min readFounders and operatorsUpdated 2026-06-19

Written by

Daniel Fleuren

Founder, AI Kick Start. 20+ years enterprise IT

Updated 2026-06-19

AI Kick Start editorial image for Hermes Agent: How Nous Research Built a Learning Agent That Actually Learns.

Decision

Test

Treat this as an answer-visibility experiment: tighten entity facts, publish proof, then sample real AI answers monthly.

Risk to watch

Vanity visibility

Do not count a citation as success unless the answer is accurate and connected to qualified enquiries.

Proof to collect

Citation log

Track priority questions, cited sources, answer accuracy, competitors named, and the page that earned the mention.

TL;DR

TL;DR: Hermes Agent from Nous Research has drawn a large GitHub following (reportedly 22,000 stars in an early snapshot, though the real count has since climbed well past that) by chasing one of the harder problems in agent development: getting an agent to keep learning. Most agents start cold every session. Hermes is built to hold onto what it picks up and adjust how it works based on past runs. The pitch is an agent that gets better over time without you having to retrain it.

Key takeaways

Hermes Agent has accumulated 22,000 GitHub stars since its release (Source: GitHub, 2026, note: this is a stale/early figure; the real [NousResearch/hermes-agent](https://github.com/NousResearch/hermes-agent) star count is now considerably higher)
The reported three-layer memory architecture is meant to enable persistent learning across sessions (Source: Nous Research, 2026, the real project is documented as a simpler file-backed memory system, so the three-layer design is unconfirmed)
Hermes reportedly shows 34% improvement in completion time across repeated similar tasks (Source: Nous Research evaluation, 2026, unverified)
The system reportedly demonstrates 8% positive transfer between related task domains (Source: Nous Research, 2026, unverified)

Analysis

Picture a new contractor who turns up to your office every morning with total amnesia. They're competent, they'll do whatever you ask, but they remember nothing from yesterday. You explain your filing system again. You re-state your preferences again. They make the same wrong assumption they made last week, because for them it's the first time. That is roughly how today's AI agents behave.

Nous Research, the open-source AI group behind the Hermes models, put out a tool in February 2026 aimed squarely at that gap. It's called Hermes Agent, and it's an MIT-licensed, open-source agent designed to carry memory forward between sessions instead of wiping the slate each time (NousResearch/hermes-agent on GitHub). The project picked up a serious GitHub audience fast, which tells you the problem it's poking at is one a lot of developers feel.

For an Australian business, the "so what" is straightforward. A lot of the friction in using AI tools is the repetition: re-teaching the same context, re-correcting the same mistakes. An agent that genuinely remembers your patterns is worth more than one that's marginally smarter on day one. Whether Hermes delivers on that promise in practice is the open question, and it's worth looking at how the thing is actually built before getting too excited.

Most AI agents share a basic flaw: they don't learn. Start a fresh session with a coding agent, a research assistant, or a task bot and it begins from the same base state every time. It doesn't recall what worked last time. It doesn't pick up your preferences. It doesn't get sharper with practice. That's the problem Nous Research set out to tackle with Hermes Agent, and the size of the project's GitHub following suggests they've hit a nerve.

Worth a flag up front: the public write-up this article draws on describes Hermes through an architecture that Nous reportedly calls "episodic memory with structured generalisation," built on three memory layers. Independent analysis of the actual project paints a simpler picture, so treat the layered design below as the claimed model rather than confirmed internals. As described, the system keeps three separate memory layers: a short-term working memory for the task in front of it, a medium-term episodic memory that holds successful and failed strategies from earlier sessions, and a long-term semantic memory that pulls general principles out of those past experiences.

The Three-Layer Memory Architecture

The working memory layer is said to behave much like a standard agent context window, holding the running conversation and the tool outputs that matter for the current job. Where Hermes is described as differing is in how it handles that working memory. Instead of just chopping off old context when it hits the limit, the reported design uses a learned compression model to squeeze less-relevant context into short summaries, which then get promoted up into episodic memory. (Note: independent reviews describe the real project's memory as a simpler two-file, character-capped setup plus full-text session search, not a learned-compression pipeline, so this layer should be read as the claimed mechanism.)

The episodic memory is, on this account, where the actual learning is meant to happen. After each task, the system reportedly runs an automatic post-mortem: which strategies did it try, which worked, which failed and why, and were there any surprises along the way. That review supposedly produces structured "experience records" stored in a vector database with detailed metadata tags. (This vector-database-and-learned-ranking description is not corroborated by independent analysis of the project, which points to Markdown skill files and full-text session search instead, so take the specifics as unconfirmed.)

When a new task starts, Hermes is described as searching this episodic memory for relevant past runs. The retrieval is said to go past plain semantic similarity, using a learned ranking model that weighs task type, domain, difficulty, and outcome to surface the best precedents. The claimed payoff: a developer who favours certain coding patterns finds that, over time, Hermes leans into those patterns without being told to.

The semantic memory layer is described as distilling general principles from the pile of episodic records. These take the form of structured rules, the kind of thing a senior engineer says out loud: "when you hit an unfamiliar API, read the official docs before guessing," or "when a test fails on and off, suspect a race condition before a logic bug." On this point the picture is closer to reality: Hermes does produce human-readable, user-editable memory and skill files you can inspect, change, or switch off, which gives a level of transparency that purely neural systems lack (analysis of the Hermes memory system). The separate "semantic memory tier" framing, though, is part of the layered model that independent reviews don't confirm.

Supporting AI Kick Start editorial image for hermes-agent-nous-research-learning-agent-22k-stars. — Generated AI Kick Start editorial visual used to explain the article's practical workflow and trade-offs.

Evaluation Results

Nous Research has reportedly published evaluation data for Hermes Agent, and the figures quoted are encouraging, though they should be treated with caution: no public Nous benchmark could be matched to the specific numbers below. On a custom benchmark said to measure task-completion efficiency across repeated similar tasks, Hermes reportedly shows a 34% improvement in completion time and a 28% drop in error rate between the first and tenth time it sees a task type (Source: Nous Research evaluation, 2026, unverified; the only documented Nous figure is roughly 40% faster completion once an agent has built up 20+ of its own skills).

The more interesting claim is positive transfer: skills picked up in one domain reportedly lift performance in related ones. An agent that's done a lot of web scraping is said to do better on API integration work, presumably because both lean on reading structured data and handling authentication. The reported transfer effect is modest, averaging around 8% improvement in related domains (Source: Nous Research, 2026, unverified; this transfer figure was not found in any published source).

Limitations and Concerns

Hermes has trade-offs. The memory system adds real overhead. Every query reportedly has to pull from the episodic and semantic stores, which is said to add 200-500ms of latency versus a stateless agent (unverified, no published benchmark supports this figure). Storage also grows over time, and Nous hasn't published a clear strategy for consolidating or forgetting old memories. Left alone, a long-running Hermes deployment could pile up gigabytes of experience records that are worth less and less.

There's a safety angle too. An agent that learns from what it sees can also learn bad habits if it's fed malicious input. Nous has reportedly added a moderation layer that flags potentially harmful learned behaviours for a human to review, but that safeguard is unconfirmed and hasn't been independently tested.

Source trail

Primary references to keep this briefing grounded

AI and automation information changes quickly. Use these official or primary references to verify the claims, pricing, product behaviour, and compliance details before committing budget or production data.

What to do next

Audit where your business is already visible in search and AI answers.
Strengthen entity facts, service pages, reviews, and source-worthy content.
Measure citations, qualified enquiries, and conversion, not just traffic.

Want help applying this? Explore Generative Engine Optimisation services.

AI Kick Start is an Illawarra-based AI studio in Figtree, helping businesses across Wollongong, Shellharbour and Kiama and right across Australia put AI to work.

Explore with AI

Use the article as a decision prompt

Summarise this AI Kick Start article for an Australian business owner. Focus on the useful decision, the risks, and the first practical next step: Hermes Agent: How Nous Research Built a Learning Agent That Actually Learns

Read with ChatGPT Open Claude Search with AI Mode

Turn this into a practical roadmap.

Use the guide as a starting point, then map the first workflow worth building.

Book an AI strategy call