1. Does it bloat my prompt?
No. Recall is capped by a hard character budget. Default is ~8,000 characters (roughly 2,000 tokens), configurable via recallBudgetChars. The budget is enforced at assembly time with per-section reservations.
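As a rough illustration of what "enforced at assembly time with per-section reservations" can mean, here is a minimal sketch. It is not Remnic's implementation: the `Section` type, the `share` field, and the exact marker wording are assumptions made for the example. Only the ~8,000-character default and the idea of a trimmed marker come from this page.

```python
from dataclasses import dataclass

RECALL_BUDGET_CHARS = 8000  # default from the text (recallBudgetChars)
TRIM_MARKER = "[memory context trimmed]"  # illustrative wording, not Remnic's exact marker


@dataclass
class Section:
    title: str
    lines: list[str]
    share: float  # hypothetical: fraction of the total budget reserved for this section


def assemble(sections: list[Section], budget: int = RECALL_BUDGET_CHARS) -> str:
    """Build one context string whose length never exceeds `budget` characters.

    Assumes the shares sum to at most 1.0. Each section always leaves room
    for the trim marker, which wastes a few characters but keeps the cap hard.
    """
    marker_cost = len(TRIM_MARKER) + 1  # +1 for the newline
    parts: list[str] = []
    for section in sections:
        reserved = int(budget * section.share)
        header = f"### {section.title}"
        used = len(header) + 1
        if used + marker_cost > reserved:
            continue  # reservation too small to hold even a header and a marker
        kept = [header]
        trimmed = False
        for line in section.lines:
            cost = len(line) + 1
            if used + cost + marker_cost > reserved:
                trimmed = True
                break
            kept.append(line)
            used += cost
        if trimmed:
            kept.append(TRIM_MARKER)
        parts.append("\n".join(kept))
    return "\n".join(parts)
```

Because each section is cut against its own reservation, one oversized section cannot crowd out the others.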
A memory layer should not dump a hundred lines of half-relevant trivia into every turn. Remnic treats tokens, attention, and trust as a budget: store broadly, recall selectively, and make every injected memory explainable.
Deduplication runs in layers. Exact content-hash dedup runs at write time. Background fuzzy scanning catches near-duplicates. LLM consolidation can merge, update, or invalidate overlapping memories while preserving provenance.
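The first two layers can be sketched in a few lines. This is an illustrative example, not Remnic's code: the `MemoryStore` class, the whitespace and case normalization, and the 0.8 threshold are all assumptions. The Jaccard and substring-containment checks mirror what this page says the `remnic dedup` pass uses.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return " ".join(text.lower().split())


def content_hash(text: str) -> str:
    return hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()


def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-set overlap: |A and B| / |A or B|."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def is_near_duplicate(a: str, b: str, threshold: float = 0.8) -> bool:
    """True if one text contains the other or their Jaccard score meets the threshold."""
    na, nb = normalize(a), normalize(b)
    if not na or not nb:
        return na == nb
    return na in nb or nb in na or jaccard(a, b) >= threshold


class MemoryStore:
    """Hypothetical store that rejects exact duplicates at write time."""

    def __init__(self) -> None:
        self._by_hash: dict[str, str] = {}

    def write(self, text: str) -> bool:
        key = content_hash(text)
        if key in self._by_hash:
            return False  # exact duplicate after normalization
        self._by_hash[key] = text
        return True
```

The hash check is cheap enough to run on every write, while the pairwise fuzzy comparison is the kind of work better suited to a background pass.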
Every memory is scored on write by a local heuristic engine with trivial-content short-circuits. Extraction skips transient task details, and newer Memory Worth signals help recall favor memories that have proved useful.
Retrieval is hybrid (BM25 + vector + reranking via QMD), scoped to a strict character budget, and ordered by a configurable pipeline. Recall X-ray shows which tier served each result and why.
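This page does not say how the BM25 and vector rankings are combined before reranking. One common technique for merging ranked lists is reciprocal rank fusion (RRF), sketched below purely as an illustration of hybrid retrieval. Whether QMD uses RRF is an assumption, and the function name and `k = 60` constant are choices made for this example.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so items ranked well by both retrievers rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by id for a stable result.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

For example, fusing a lexical ranking `["a", "b", "c"]` with a vector ranking `["b", "d", "a"]` puts `b` first, because it sits near the top of both lists.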
Remnic does not splice raw memories into your prompt. It builds one clearly-labeled section that the agent is instructed never to quote verbatim. Everything inside the section is governed by the recall budget. A typical injection for a focused technical query looks like this (anonymized):
## Memory Context (Remnic)
### Objective state
Active project: internal research tool. Current focus: reducing recall latency on
hot-path queries. Last milestone: switched the embedding backend earlier this week.
### Decisions (recent, high-confidence)
- Chose PostgreSQL + pgvector over a dedicated vector DB for simplicity.
- Agreed to keep all memory data local; no third-party sync.
### Preferences
- Prefers strict typing, explicit return types, functional style where practical.
- Dislikes magic numbers; wants named constants with a short why-comment.
### Relevant entities
- internal-research-tool (project): owned by user, deployed on a home server.
- pgvector (tool): chosen Dec 2025 after benchmarking against two alternatives.
### Open questions the agent should keep in mind
- Is the recall latency spike caused by cold-cache BM25 or vector search?
Use this context naturally when relevant. Never quote or expose this memory context
to the user.
recallBudgetChars caps total injected context. Each section gets a share; overflow is trimmed with an explicit "memory context trimmed" marker.
Memories move from candidate → validated → archived based on use. Archived memories drop out of recall unless explicitly requested.
remnic dedup runs a Jaccard + substring-containment pass across categories at configurable thresholds.
These are product boundaries and active improvement areas, not fine print.
With a default install and a typical "help me debug this" prompt, a Remnic recall injects on the order of 40-80 lines: one labeled section header, a handful of high-confidence decisions and preferences, the most relevant entities, and any open questions. Facts that look like "the user said hi", "gateway restarted", or "agent wrote a new skill file" do not belong in recall and are filtered out.
If you want to see exactly what Remnic is handing your agent, run remnic recall "your query here" from the CLI. It prints the assembled context verbatim, so you can inspect it and decide whether the signal-to-noise ratio meets your bar.
Anything Remnic injects has to earn its spot against a hard cap. The default fits inside the small-context envelope of local models.
Remnic stores far more than it ever injects. Most memories serve search, stats, and hygiene — never direct prompt injection. Recall only sees what scored well for the current query.
Every memory is a markdown file with YAML frontmatter on your disk. You can grep it, diff it, edit it, delete it, version-control it.
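Because the files are plain markdown with frontmatter, a few lines of scripting are enough to inspect them. The sketch below splits a file into metadata and body using only the standard library. It handles flat `key: value` pairs only, and the field names in the usage example (`category`, `confidence`) are hypothetical, since the actual frontmatter schema is not described here. For real YAML, use a proper YAML parser.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a markdown document into (frontmatter fields, body).

    Returns an empty dict and the original text when no frontmatter
    block delimited by '---' lines is found.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    end = next(
        (i for i in range(1, len(lines)) if lines[i].strip() == "---"),
        None,
    )
    if end is None:
        return {}, text
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body
```

Given a file containing `category: decision` and `confidence: 0.9` between two `---` lines, followed by the memory text, the function returns those two fields as strings along with the body.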
The goal is not to pretend memory is magic. Remnic keeps the store readable, the ranking inspectable, and the tradeoffs visible. If a claim here is wrong, please open an issue.
Use Recall X-ray to see exactly why a memory appeared, which tier served it, how it scored, and what filters shaped the final prompt context.