Ever started a fresh Claude Code session and spent the first 50 tokens explaining the same project context you laid out yesterday? Me too. And it’s the cold-start tax — on a paid subscription those losses add up fast. So I’ve tried CLAUDE.md (manual upkeep), –continue (full transcript replay, token-heavy), claude-mem (my earlier review), and external memory tools (API calls, data leaves your machine). All work, but none solve the problem cleanly.

Then I stumbled on recall — a 3-day-old project with 216 stars. Here is a Claude Code plugin that auto-captures your session and summarizes it locally, zero LLM calls and zero outbound traffic. Worth testing.

What Recall Does for Claude Code Memory

Recall hooks into Claude Code’s SessionStart and SessionEnd events. While you work, it writes every turn into .recall/history.md — an append-only, local log. Now when you’re done, you run /recall:save, and a classical Python summarizer (TF-IDF + TextRank, fully offline) condenses the session into .recall/context.md: a ~1-2K token resume with the goal, summary, files touched, and next steps.

So next session? Claude asks “Resume from saved context?” — say yes, and you’re back exactly where you stopped. Zero tokens spent on re-orientation.

And the key differentiator: that summarization algorithm. No LLM call, no API key, no external model. Just TF-IDF sentence vectors, a cosine-similarity graph, and TextRank power iteration. And if numpy is available it runs vectorized; if not, an identical pure-Python path handles it.

How recall Compares to Other Memory Options

FeatureCLAUDE.md / #–continue / –resumemempalace / mem0recall
UpkeepManualNoneAutomaticAutomatic
Resumption costSmall (tokens)Large (full replay)External API call~1-2K tokens, zero LLM cost
PrivacyLocalLocalSends data externallyEntirely offline
PortabilityGit-friendlyMachine-localCloud-dependentPlaintext, diffable
How Claude treats itInstructionsConversation historyExternal contextFenced reference data

But what I like isn’t that recall wins on paper — it’s the privacy guarantee column. So I tested mempalace and mem0 before, and the moment I realized session transcripts were hitting an API endpoint, I pulled the plug. Recall’s README explicitly documents no network calls, no credentials, and no ANTHROPIC_* references. That’s a concrete promise, not marketing copy.

Installing and Testing Recall

So I tested this on my Ryzen 9 workstation running Claude Code on a Pro subscription. The install is refreshingly simple:

/plugin marketplace add raiyanyahya/recall
/plugin install recall@recall

And two commands. No pip install, no environment variables, no config file to write. Took me about 30 seconds.

Then I ran a full afternoon session building a FastAPI endpoint — about 40 turns. At the end I ran /recall:save. The summarizer finished almost instantly (pure-Python path, no numpy in that venv). The resulting context.md was 1,847 bytes — compact enough to load into any new session without thinking about token budgets.

And the next session, Claude asked: “I see saved context from your last session — would you like to resume?” Clicked yes, and it picked up the thread without me typing a single explanation. Honestly? That’s the kind of friction-removal that makes Claude Code feel like a continuous pair programmer instead of a series of cold-start chats.

What to Watch Out For

But recall is brand new — last commit was three days ago. At 4 forks and 216 stars, the community around it is tiny. That means fewer edge cases tested, fewer issues to learn from, and no guarantee of long-term maintenance. Still, the summarizer’s output quality depends on well-structured session transcripts; I noticed one session where the extracted summary didn’t capture the debugging dead-end I chased down, which could mislead a teammate reading the shared context.md.

One more thing though — if you commit .recall/ to a shared repo, the context.md becomes a prompt-injection surface. The plugin fences it as untrusted reference data, but personally I’d keep it gitignored unless I fully trust every contributor.

Bottom Line

Still, recall is the simplest session-memory workflow I’ve tested for Claude Code. Two commands to install, zero tokens spent on memory, and your data stays on your machine. It’s early — very early — but the approach is sound, the privacy guarantee is real, and the cost (free, on top of your existing subscription) makes it a no-brainer to try.

So if the cold-start problem bugs you even a little, spend 30 seconds installing recall. You’ve got nothing to lose but your daily token overhead.

Disclosure: Some links below are affiliate links. If you sign up through them, I may earn a commission at no extra cost to you.