Recall: Fully-Local Project Memory for Claude Code

Ever started a fresh Claude Code session and spent the first 50 tokens explaining the same project context you laid out yesterday? Me too. And it’s the cold-start tax — on a paid subscription those losses add up fast. So I’ve tried CLAUDE.md (manual upkeep), –continue (full transcript replay, token-heavy), claude-mem (my earlier review), and external memory tools (API calls, data leaves your machine). All work, but none solve the problem cleanly.

Then I stumbled on recall — a 3-day-old project with 216 stars. Here is a Claude Code plugin that auto-captures your session and summarizes it locally, zero LLM calls and zero outbound traffic. Worth testing.

What Recall Does for Claude Code Memory

Recall hooks into Claude Code’s SessionStart and SessionEnd events. While you work, it writes every turn into .recall/history.md — an append-only, local log. Now when you’re done, you run /recall:save, and a classical Python summarizer (TF-IDF + TextRank, fully offline) condenses the session into .recall/context.md: a ~1-2K token resume with the goal, summary, files touched, and next steps.

So next session? Claude asks “Resume from saved context?” — say yes, and you’re back exactly where you stopped. Zero tokens spent on re-orientation.

And the key differentiator: that summarization algorithm. No LLM call, no API key, no external model. Just TF-IDF sentence vectors, a cosine-similarity graph, and TextRank power iteration. And if numpy is available it runs vectorized; if not, an identical pure-Python path handles it.

How recall Compares to Other Memory Options

Feature	CLAUDE.md / #	–continue / –resume	mempalace / mem0	recall
Upkeep	Manual	None	Automatic	Automatic
Resumption cost	Small (tokens)	Large (full replay)	External API call	~1-2K tokens, zero LLM cost
Privacy	Local	Local	Sends data externally	Entirely offline
Portability	Git-friendly	Machine-local	Cloud-dependent	Plaintext, diffable
How Claude treats it	Instructions	Conversation history	External context	Fenced reference data

But what I like isn’t that recall wins on paper — it’s the privacy guarantee column. So I tested mempalace and mem0 before, and the moment I realized session transcripts were hitting an API endpoint, I pulled the plug. Recall’s README explicitly documents no network calls, no credentials, and no ANTHROPIC_* references. That’s a concrete promise, not marketing copy.

Installing and Testing Recall

So I tested this on my Ryzen 9 workstation running Claude Code on a Pro subscription. The install is refreshingly simple:

/plugin marketplace add raiyanyahya/recall
/plugin install recall@recall

And two commands. No pip install, no environment variables, no config file to write. Took me about 30 seconds.

Then I ran a full afternoon session building a FastAPI endpoint — about 40 turns. At the end I ran /recall:save. The summarizer finished almost instantly (pure-Python path, no numpy in that venv). The resulting context.md was 1,847 bytes — compact enough to load into any new session without thinking about token budgets.

And the next session, Claude asked: “I see saved context from your last session — would you like to resume?” Clicked yes, and it picked up the thread without me typing a single explanation. Honestly? That’s the kind of friction-removal that makes Claude Code feel like a continuous pair programmer instead of a series of cold-start chats.

What to Watch Out For

But recall is brand new — last commit was three days ago. At 4 forks and 216 stars, the community around it is tiny. That means fewer edge cases tested, fewer issues to learn from, and no guarantee of long-term maintenance. Still, the summarizer’s output quality depends on well-structured session transcripts; I noticed one session where the extracted summary didn’t capture the debugging dead-end I chased down, which could mislead a teammate reading the shared context.md.

One more thing though — if you commit .recall/ to a shared repo, the context.md becomes a prompt-injection surface. The plugin fences it as untrusted reference data, but personally I’d keep it gitignored unless I fully trust every contributor.

Bottom Line

Still, recall is the simplest session-memory workflow I’ve tested for Claude Code. Two commands to install, zero tokens spent on memory, and your data stays on your machine. It’s early — very early — but the approach is sound, the privacy guarantee is real, and the cost (free, on top of your existing subscription) makes it a no-brainer to try.

So if the cold-start problem bugs you even a little, spend 30 seconds installing recall. You’ve got nothing to lose but your daily token overhead.

Disclosure: Some links below are affiliate links. If you sign up through them, I may earn a commission at no extra cost to you.

Vultr — starts at $6/mo
DigitalOcean — $200 credit for new users

What Recall Does for Claude Code Memory#

How recall Compares to Other Memory Options#

Installing and Testing Recall#

What to Watch Out For#

Bottom Line#

What Recall Does for Claude Code Memory

How recall Compares to Other Memory Options

Installing and Testing Recall

What to Watch Out For

Bottom Line