WeClone fine-tunes a
model to sound like you. Remnic gives it long-term memory. Together the
avatar remembers what happened yesterday — and still sounds like you
while doing it. Three new packages cover the whole loop: bootstrap
memory from chat history, train on clean distilled data, and serve the
avatar behind a memory-aware proxy.
Each package handles one phase. Install whichever ones match your
workflow — or install all three for the end-to-end loop.
@remnic/import-weclone v1.0.0
Bootstrap memory from chat history
Import WeClone-preprocessed chat exports (Telegram, WhatsApp,
Discord, Slack) directly into Remnic. Skip the wait for organic
memory accumulation — seed your store from years of real
conversations.
Lower trust level on imports so historical data doesn't drown out organic recall
@remnic/export-weclone v1.0.0
Export clean training data
Turn your Remnic memory store into an Alpaca-format training set
for LLaMA
Factory — the pipeline WeClone uses under the hood. Structured
facts and preferences, not noisy raw logs.
Alpaca JSON output WeClone ingests directly
Template-based Q/A synthesis — no extra LLM calls
Style-marker extraction to match your tone
Belt-and-suspenders PII sweep before write
@remnic/connector-weclone v1.0.0
Serve the avatar with memory
An OpenAI-compatible HTTP proxy that sits in front of a deployed
WeClone API server. Every chat completion gets Remnic memory
injected into the system prompt, and every response is observed
back into the store.
Drop-in /v1/chat/completions endpoint
Per-caller isolation via X-Caller-Id or user
Graceful degrade if Remnic is unreachable
Works with Discord bots, Telegram bots, AstrBot, LangBot
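The injection step is easy to picture. Here is an illustrative Python sketch of the system-append strategy (function and variable names are ours, not the connector's internals, and the Memory Context wording is an assumption):

```python
def inject_memories(messages, memories, position="system-append"):
    """Merge recalled memories into the conversation's system message
    as a Memory Context block (illustrative sketch, not the real proxy)."""
    block = "## Memory Context\n" + "\n".join(f"- {m}" for m in memories)
    out = [dict(m) for m in messages]  # copy; don't mutate the caller's list
    system = next((m for m in out if m["role"] == "system"), None)
    if system is None:
        # No system message yet: create one that holds only the block.
        out.insert(0, {"role": "system", "content": block})
    elif position == "system-append":
        system["content"] = system["content"] + "\n\n" + block
    else:  # system-prepend
        system["content"] = block + "\n\n" + system["content"]
    return out
```

The real proxy also enforces the memoryInjection.maxTokens budget; that trimming step is omitted here for brevity.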
The full loop
Each package plugs into a different point of the avatar lifecycle.
Run them together for an avatar that is continuously improved by the
same memory store that powers your coding agents.
Upstream: WeClone API server (port 8000 · fine-tuned avatar)
Quickstart
You can adopt the packages in any order. This is the end-to-end flow
if you have no avatar yet and want the full loop.
1 — Install Remnic and the daemon
npm install -g @remnic/cli
remnic daemon install
2 — (Optional) Seed memory from chat history
If you have an existing chat export, run it through
WeClone's preprocessor
first (it handles platform quirks and redacts PII), then bulk-import
the preprocessed JSON into Remnic:
# Dry run: validate the export end-to-end, report counts
remnic engram bulk-import \
--source weclone \
--file ./preprocessed_telegram.json \
--platform telegram \
--dry-run
# Real import into a named namespace
remnic engram bulk-import \
--source weclone \
--file ./preprocessed_telegram.json \
--platform telegram \
--namespace personal-chat-history
Supported platforms: telegram, whatsapp,
discord, slack. Imported memories are
tagged with trustLevel: "import" so a large historical
backfill doesn't outrank organic memories in recall.
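Remnic's actual ranking lives inside the daemon, but the effect of the import trust level can be shown with a toy scorer (the weights below are made up purely for illustration):

```python
# Hypothetical trust weights: organic memories outrank bulk imports.
TRUST_WEIGHT = {"organic": 1.0, "import": 0.5}

def rank(memories, top_k=3):
    """Order memories by similarity discounted by trust level (toy model)."""
    return sorted(
        memories,
        key=lambda m: m["similarity"] * TRUST_WEIGHT[m["trustLevel"]],
        reverse=True,
    )[:top_k]
```

Even a highly similar imported message loses to a moderately similar organic memory, which is the behavior the trust tag is there to produce.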
3 — Export memories as WeClone training data
Turn your (now-bootstrapped) Remnic store into an Alpaca-format
dataset. The synthesizer generates conversational Q/A pairs from
structured memories using category-driven templates — no extra LLM
calls.
A belt-and-suspenders PII sweep runs by default on the output file.
Disable only with --no-privacy-sweep if you already have
a compensating control.
4 — Train the avatar with WeClone
Hand the resulting weclone-dataset.json to
WeClone, which
drives LLaMA Factory under the hood. Follow the WeClone docs for the
actual training run — this is outside Remnic's scope.
5 — Install the connector
npm install -g @remnic/connector-weclone
Run the package's installer (see the connector docs for the exact
command). It mints a dedicated Remnic auth token, writes a proxy config
to ~/.remnic/connectors/weclone.json (owner-only
permissions, since it contains the bearer token), and registers the
connector so remnic connectors list / doctor can see it.
6 — Start the proxy
remnic-weclone-proxy
Point your Discord bot, Telegram bot, AstrBot, LangBot, or any
OpenAI-compatible caller at
http://localhost:8100/v1. The proxy forwards every
request to WeClone after injecting relevant Remnic memories into the
system prompt, and observes each response back into the store.
The proxy reads config from
~/.remnic/connectors/weclone.json. The installer
pre-fills sensible defaults — these are the knobs you'll most likely
want to change.
wecloneApiUrl (default: http://localhost:8000/v1)
  Base URL of the WeClone API. Path-prefixed or bare origin both work.
proxyPort (default: 8100)
  Local port the proxy listens on.
remnicDaemonUrl (default: http://localhost:4318)
  Remnic daemon URL. Change this if the daemon runs on a different host.
sessionStrategy (default: single)
  single shares one memory session; caller-id maps each caller to its
  own namespace via X-Caller-Id or body.user.
memoryInjection.maxTokens (default: 1500)
  Approximate token budget for memory injected into the system prompt.
memoryInjection.position (default: system-append)
  Either system-append (append to existing system message) or
  system-prepend.
memoryInjection.template (default: a Memory Context block)
  Template wrapping recalled memories. {memories} is the sole placeholder.
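Put together, a config using the defaults above might look like this (the template string is illustrative, and the installer also writes auth fields that are omitted here):

```json
{
  "wecloneApiUrl": "http://localhost:8000/v1",
  "proxyPort": 8100,
  "remnicDaemonUrl": "http://localhost:4318",
  "sessionStrategy": "single",
  "memoryInjection": {
    "maxTokens": 1500,
    "position": "system-append",
    "template": "## Memory Context\n{memories}"
  }
}
```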
Building a multi-user bot? Set sessionStrategy to
caller-id and have your bot forward each user's ID as
X-Caller-Id. Memory stays partitioned per user with no
extra work.
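Concretely, a multi-user bot would shape each call roughly like this (a standard-library Python sketch; the URL assumes the default proxy port, and the model name is a placeholder for whatever your WeClone deployment serves):

```python
import json

def build_proxy_request(user_id, text, model="weclone"):
    """Shape a chat-completions call for the proxy, tagging the caller
    so memory stays partitioned per user (illustrative sketch)."""
    return {
        "url": "http://localhost:8100/v1/chat/completions",
        "headers": {
            "Content-Type": "application/json",
            "X-Caller-Id": user_id,  # drives per-caller memory namespaces
        },
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": text}],
        }),
    }
```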
Why Remnic's exports beat raw chat logs
WeClone normally trains on raw chat exports — Telegram, WeChat,
WhatsApp. Raw logs work but carry a lot of noise: one-word replies,
spam, PII, transient context. Fine-tuning on that noise teaches the
avatar to reproduce it.
Remnic has already distilled those same conversations into structured
facts, preferences, skills, decisions, and entities. The synthesizer
turns each record into a clean conversational Q/A pair using
category-driven templates:
category: preference
memory: "Dark roast coffee, Ethiopian Yirgacheffe specifically"
tags: food, coffee
generated pair:
instruction: "What kind of food, coffee do you like?"
output: "Dark roast coffee, Ethiopian Yirgacheffe specifically"
The result: a much higher signal-to-noise training set. Optional
style-marker extraction pulls the user's typical sentence length,
emoji usage, formality, and common phrases from source transcripts
so the generated output matches their tone too.
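The transformation above amounts to string templating over each memory record. A sketch (the template text is inferred from the example; the real synthesizer ships its own per-category templates):

```python
# Hypothetical per-category templates; {tags} is filled from the record.
TEMPLATES = {
    "preference": "What kind of {tags} do you like?",
    "fact": "Tell me about {tags}.",
}

def synthesize(record):
    """Turn one structured memory into an Alpaca-style instruction pair."""
    template = TEMPLATES[record["category"]]
    return {
        "instruction": template.format(tags=", ".join(record["tags"])),
        "input": "",
        "output": record["memory"],
    }
```

Because the pair is built from the record alone, no LLM call is needed, which is what keeps the export step fast and deterministic.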
Privacy & security
Local by default. Every package runs on your machine. Training data is written to disk, not uploaded.
Owner-only config. The proxy config carries a Remnic bearer token and is written with 0o600 permissions.
Hop-by-hop header stripping. The proxy strips connection, keep-alive, proxy-authorization, and other hop-by-hop headers on both forward and response paths so credentials never leak upstream or downstream.
Double PII sweep on export. Remnic's own privacy controls run during extraction; sweepPii runs again on the training dataset as a second-chance filter before the JSON hits disk.
Symlink guard. The core converter refuses to follow symlinks or hard-linked .md files under memoryDir to block exfiltration vectors.
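The header-stripping rule above follows RFC 7230: the fixed hop-by-hop set plus anything named in the Connection header must not be forwarded. A minimal Python sketch (assuming header keys are already lowercased):

```python
# Hop-by-hop headers per RFC 7230 section 6.1; never forwarded by a proxy.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
}

def strip_hop_by_hop(headers):
    """Drop hop-by-hop headers, including any the Connection header names.
    Assumes keys are lowercase (as after normalization)."""
    connection = headers.get("connection", "")
    named = {h.strip().lower() for h in connection.split(",") if h.strip()}
    drop = HOP_BY_HOP | named
    return {k: v for k, v in headers.items() if k.lower() not in drop}
```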
Common questions
Do I have to use all three packages?
No. They compose but aren't interlocked. Install only
connector-weclone if you already have a fine-tuned
avatar and just want to add memory. Install only
import-weclone if you want to seed Remnic from chat
history for another use case (your coding agents, for example).
Install only export-weclone if you want to train a
new model from Remnic without deploying a proxy.
What's the difference between this and plain WeClone?
Plain WeClone gives you a model that talks like you at the time
the training data was captured and has no memory between
conversations. Remnic adds two things: much cleaner training data
distilled from structured memories (not noisy logs), and
persistent memory at inference time so the avatar remembers
what was said last week, last month, or last year.
Which chat platforms are supported for import?
Telegram, WhatsApp, Discord, and Slack, via WeClone's existing
preprocessing pipeline. You run WeClone's preprocessor first
(which handles platform-specific parsing and Presidio-based PII
redaction), then point remnic engram bulk-import
at the resulting JSON.
Does the proxy work with OpenAI-compatible clients other than WeClone?
The proxy is specifically designed to sit in front of a WeClone API
server, but because it speaks OpenAI's chat-completions format on
both sides, it will work in front of any OpenAI-compatible backend
(vLLM, LM Studio, text-generation-webui with the OpenAI extension,
etc.). The memory injection and observation are upstream-agnostic.
What happens if the Remnic daemon is down?
The proxy degrades gracefully: it logs the failure and forwards the
request to WeClone without memory injection, so the
caller always gets a response. observe is
fire-and-forget so a failed write never blocks the reply either.
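In code terms, the degrade path is a plain fallback around recall, with observation fired after the reply. A simplified sketch where recall, forward, and observe stand in for the proxy's real calls:

```python
def handle_chat(request, recall, forward, observe, log=print):
    """Forward a chat request, injecting memories when recall succeeds
    and skipping injection when the daemon is unreachable (sketch)."""
    try:
        memories = recall(request)
    except Exception as err:           # daemon down, timeout, etc.
        log(f"memory recall failed, forwarding without injection: {err}")
        memories = []
    response = forward(request, memories)
    try:
        observe(request, response)     # fire-and-forget; never blocks reply
    except Exception as err:
        log(f"observe failed (ignored): {err}")
    return response
```

Either failure mode leaves the caller with a normal WeClone response; only the memory features are silently absent until the daemon comes back.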
Why does my imported memory feel "background-weight"?
That's deliberate. Bulk imports tag every memory with
trustLevel: "import" so a historical backfill (which
can be tens of thousands of messages) doesn't drown out organic
memories created during live sessions. Imported memories still
surface in recall — they just carry less weight.