Zero agent-side choice
Recall is structural, not tool-based. The MemoryProvider hook fires before every LLM call, regardless of what the model decides.
The remnic-hermes package plugs Remnic directly into
Hermes Agent using the
MemoryProvider protocol. Every LLM call gets relevant memories pre-fetched
into the system prompt. Every response is automatically observed for future
recall. The agent does not choose when to recall — it always does.
Every agent conversation starts from zero unless something deliberately
bridges the gap. The obvious fix is to give the agent a
recall tool and expect it to call the tool before
answering. That approach works until the moment it matters most: when
the context is long, the task is complex, or the model is under token
pressure. At those moments the agent skips the tool call, forgets to
check, or simply does not know what to search for.
MCP-based memory integration — where Remnic registers
remnic_recall as a callable tool — is genuinely useful for
explicit, user-directed queries. But for ambient recall that should
happen on every turn, it places a burden on the model that will not
always be honored.
Hermes Agent defines a MemoryProvider protocol: a set of lifecycle
hooks that run at fixed points in the agent loop, outside the LLM's
control. The remnic-hermes package implements that
protocol and connects it to the Remnic daemon.
The result is structural recall. Before the LLM sees the user's
message, the plugin fires pre_llm_call, queries Remnic
with the last user message, and injects the results directly into the
system prompt as a <remnic-memory> block. After the
LLM responds, sync_turn fires and sends the last two
messages to Remnic for real-time observation. At session end,
extract_memories sends the full transcript for a deeper
extraction pass. The agent does not participate in any of this. It
just gets better context.
| Aspect | MCP only | MemoryProvider |
|---|---|---|
| Recall | Agent must call remnic_recall | Automatic on every turn before the LLM call |
| Observe | Agent must call remnic_store | Automatic after every response |
| Latency | Tool call overhead on the hot path | Pre-fetched; Remnic query runs before LLM call |
| Reliability | Agent may skip under load or context pressure | Structural — the hook cannot be skipped |
| Tool call budget | Recall consumes one tool call per turn | No tool call consumed; memory arrives in system prompt |
The two approaches are complementary. remnic-hermes also
registers remnic_recall, remnic_store, and
remnic_search as explicit tools for cases where the agent
needs to search memory on demand. Structural recall handles the
ambient case; explicit tools handle the intentional case.
Recall is structural, not tool-based. The MemoryProvider hook fires before every LLM call, regardless of what the model decides.
Memory arrives in the system prompt, not via a tool call round-trip. The agent's tool budget is preserved for actual task work.
Remnic stores memories on disk as plain markdown files. They survive Hermes restarts, machine reboots, and profile changes.
The Remnic daemon runs on your machine. No cloud service, no telemetry, no subscription. Your memories are plain files.
Each provider instance generates a unique session_key. Different Hermes profiles can use different keys or share one.
If Remnic is down or unreachable, the plugin swallows errors silently and the agent keeps working without memory context.
remnic_recall, remnic_store, remnic_search registered as Hermes tools for on-demand use.
remnic-hermes and the Remnic core are both MIT. Inspect, fork, and extend freely.
pip install remnic-hermes Requires Python 3.10+. Install into the same environment Hermes uses. Hermes Agent v0.7.0+ is required for the MemoryProvider protocol.
remnic connectors install hermes Generates a dedicated Hermes token, writes it to ~/.remnic/tokens.json, adds the remnic: block to config.yaml, and runs a daemon health check.
Hermes reads its plugin list at startup. Full restart required. Config reload is not sufficient.
hermes --version && pip show remnic-hermes Start a session and issue a query. Check the Hermes debug log for <remnic-memory> blocks.
The plugin reads from a remnic: key in your Hermes
config.yaml. All fields are optional — defaults work for a
standard local Remnic install.
plugins:
- remnic_hermes
remnic:
host: "127.0.0.1" # default
port: 4318 # default
token: "" # empty = auto-load from ~/.remnic/tokens.json
session_key: "" # auto-generated as hermes-<12hex>
timeout: 30.0 REMNIC_HOST and REMNIC_PORT env vars override
the config values. Legacy ENGRAM_HOST /
ENGRAM_PORT are accepted during the transition. A legacy
engram: config block is accepted in place of
remnic: — the plugin reads remnic: first and
falls back to engram:.
remnic daemon status
remnic daemon install # installs and starts the launchd/systemd service Verify ~/.remnic/tokens.json exists and contains a hermes connector entry. Re-running remnic connectors install hermes regenerates the token.
The plugin skips recall when the last user message is fewer than three words. Force an explicit recall to confirm the daemon round-trip works:
remnic daemon status
remnic recall "any query with at least three words" If you see ModuleNotFoundError: No module named 'remnic_hermes', check which Python Hermes is running under and install into that one: which python && pip show remnic-hermes.