self-learning-skills: An AI Skill That Learns on the Job

Ever spent half a session re-teaching your AI agent how to reach the prod DB, where the credentials live, and the exact deploy command — only to have it forget everything the moment you close the terminal? Yeah, me too.

I’ve been running Claude Code and Cursor daily for months now. And the single biggest frustration isn’t the code quality. It’s that every session starts from zero. The hard-won knowledge from debugging that flaky CI job or discovering the one weird config quirk that breaks staging — poof, gone when the session ends.

That’s the problem self-learning-skills solves. It’s a meta-skill that watches your agent work, recognizes when it’s earned a reusable “golden path,” and auto-harvests that knowledge into a persistent skill for next time. 742 GitHub stars in four days, active development, MIT license. So let’s see if it lives up to the hype.

How self-learning-skills Works: Recognize → Capture → Reuse Loop

The core idea is deceptively simple. Your agent works normally. But when it hits a moment that matters — a non-obvious command that finally worked, a project fact that took three tries to uncover, an operational workflow you know you’ll repeat — self-learning-skills detects it and acts:

Recognize — The skill watches for cues: multiple failed attempts before success, a command you had to look up, or you explicitly saying “remember this.”
Capture — It saves the procedure (not just the answer) plus a “what didn’t work” note. No prompting required — it picks the scope and name automatically and tells you what it captured.
Reuse — Next session, the saved skill loads automatically. Your agent already knows the route.
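
To make the loop concrete, here's a minimal sketch of how I picture the three steps. Everything in it is my own illustration: the `Attempt` and `CapturedSkill` classes, the function names, and the heuristics are hypothetical and are not taken from the project's source.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Attempt:
    """One thing the agent tried, and whether it worked (hypothetical type)."""
    command: str
    succeeded: bool


@dataclass
class CapturedSkill:
    """Hypothetical record: the working procedure plus the dead ends."""
    name: str
    what_worked: str
    what_didnt_work: list = field(default_factory=list)


def should_capture(attempts: list, user_message: str = "") -> bool:
    """Recognize: an explicit request, or failures followed by a success."""
    if "remember this" in user_message.lower():
        return True
    if not attempts or not attempts[-1].succeeded:
        return False
    return any(not a.succeeded for a in attempts[:-1])


def capture(name: str, attempts: list) -> CapturedSkill:
    """Capture: keep the last (successful) command and note what failed.

    Assumes should_capture() already confirmed the last attempt succeeded.
    """
    failures = [a.command for a in attempts if not a.succeeded]
    return CapturedSkill(name, attempts[-1].command, failures)


def reuse(library: dict, name: str) -> Optional[str]:
    """Reuse: a later session looks up the saved route by name."""
    skill = library.get(name)
    return skill.what_worked if skill else None


# Made-up session: one failed command, then one that worked.
session = [Attempt("first-try", False), Attempt("second-try", True)]
library = {}
if should_capture(session):
    library["example-skill"] = capture("example-skill", session)
print(reuse(library, "example-skill"))
```

The real skill presumably does this through instructions to the agent rather than through Python, so treat this as a mental model only.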

I installed it on my MacBook Air M3 with a single command:

npx skills add kulaxyz/self-learning-skills -g

It took about 30 seconds. The CLI auto-detected Claude Code and Cursor on my machine and set up the right persistence paths for both.

Live Demo: Debugging a Flaky CI Test

Here’s where it gets interesting. I had this Python project where the CI pipeline kept failing on a specific integration test: flaky DNS resolution that only happened in the GitHub Actions runner. I’d debugged it three times over two weeks, each time starting from scratch.

So with self-learning-skills installed, I started a new Claude Code session and said: “The test_webhook_delivery CI job is flaky — it fails about 30% of the time with a DNS timeout. Find the root cause.”

Claude dug through the codebase, found the httpx client with no timeout config, cross-referenced the CI logs, and nailed the fix in about 12 minutes. What happened next surprised me: the skill auto-created a skills/ci-dns-timeout-workaround/SKILL.md entry.

Next session, I asked “check CI flakiness,” and Claude already knew about the DNS timeout issue, the exact file to patch, and the timeout=30.0 fix that worked. No re-explanation needed.

Here’s what the harvested skill looked like:

| Field | Content |
| --- | --- |
| What worked | Set `httpx.Client(timeout=30.0)` in `tests/conftest.py` |
| What didn’t work | Retry logic inside the test; DNS was failing at the transport layer |
| Context | CI runner DNS resolution is slower than local dev |
| Confidence | Verified; fix passed 5 consecutive CI runs |

That “what didn’t work” line is the killer feature. Most memory tools just store what happened. But this one stores the negative knowledge too — the dead ends you don’t want to re-explore.
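
To show the shape of that record, here's a minimal sketch that renders the four fields into markdown text of the kind a SKILL.md might hold. The `render_skill` function, the front matter with `name` and `description`, the section layout, and the description string are all my assumptions for illustration; I haven't confirmed the exact format the tool writes.

```python
def render_skill(name: str, description: str, fields: dict) -> str:
    """Render a captured record as markdown text (hypothetical layout)."""
    # Front matter first, then one section per field, in insertion order.
    lines = ["---", f"name: {name}", f"description: {description}", "---", ""]
    for heading, body in fields.items():
        lines += [f"## {heading}", body, ""]
    return "\n".join(lines)


skill_md = render_skill(
    name="ci-dns-timeout-workaround",
    # Description wording is made up for this example.
    description="Fix for flaky DNS timeouts in the test_webhook_delivery CI job",
    fields={
        "What worked": "Set `httpx.Client(timeout=30.0)` in `tests/conftest.py`",
        "What didn't work": "Retry logic inside the test; DNS was failing at the transport layer",
        "Context": "CI runner DNS resolution is slower than local dev",
        "Confidence": "Verified; fix passed 5 consecutive CI runs",
    },
)
print(skill_md)
```

Keeping the dead ends as their own section is the point: the next session can read them before it tries the same retry logic again.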

Tool-by-Tool: How self-learning-skills Persists Knowledge

The skill adapts its persistence strategy depending on which agent you use:

| Tool | Persists to | Auto-loads via |
| --- | --- | --- |
| Claude Code / Codex | `skills/<name>/SKILL.md` | Skill description matching |
| Cursor | `.cursor/rules/learned/<name>.mdc` | Rule description / globs |
| Zed / Aider / Gemini CLI | `AGENTS.md` (or project notes) | Always-read instructions |

One install covers all of them. The -g flag makes it global across all your projects. For a per-project setup, drop the flag.
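
The mapping in that table is simple enough to sketch in a few lines. This is only an illustration of the routing: `persist_path` and its `root` parameter are hypothetical names of mine, and the paths are copied from the table rather than from the tool's code.

```python
from pathlib import Path


def persist_path(tool: str, skill_name: str, root: Path = Path(".")) -> Path:
    """Return where a learned skill would be saved for a given agent tool."""
    key = tool.lower()
    if key in {"claude code", "codex"}:
        return root / "skills" / skill_name / "SKILL.md"
    if key == "cursor":
        return root / ".cursor" / "rules" / "learned" / f"{skill_name}.mdc"
    if key in {"zed", "aider", "gemini cli"}:
        # These tools share one always-read file instead of per-skill files.
        return root / "AGENTS.md"
    raise ValueError(f"unknown tool: {tool}")


for tool in ("Claude Code", "Cursor", "Aider"):
    print(tool, "->", persist_path(tool, "ci-dns-timeout-workaround"))
```

Note the asymmetry: the first two tools get one file per skill, while the AGENTS.md group accumulates everything in a single file.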

Where self-learning-skills Fits in Your Agent Stack

self-learning-skills isn’t a replacement for tools like addyosmani/agent-skills or claude-mem. It sits alongside them.

vs. agent-skills (68k★): agent-skills gives you a curated collection of production-grade engineering skills — logging patterns, testing conventions, deployment templates. It’s a library of generic best practices. self-learning-skills captures your project’s specific quirks — the custom deploy script, the non-standard config path, the known flaky test. So they’re complementary, not competitors. Install both.

vs. claude-mem (85k★): claude-mem persists raw session memory: context, conversations, observations. It’s great for remembering “I was working on X” across sessions, but it stores everything in a flat pile. self-learning-skills creates structured, task-specific procedural knowledge: named skills with clear scope, verified steps, and annotated failure patterns. In practice, I use claude-mem for ambient context and self-learning-skills for the actual workflows I want automated. (My take on local memory alternatives is in the Recall review.)

vs. manual AGENTS.md: Before this, I maintained AGENTS.md by hand, updating it when I remembered, which was never often enough. The auto-capture here is the real upgrade: it catches things you’d forget to document. (I wrote about the broader agent experience-sharing concept in my Agent Apprenticeship deep dive.)

Limitations & Caveats

I have to be honest: this is a young project. With 742 stars and 7 forks, the community is still small, and there’s no SaaS version, no GUI, and no enterprise support. On the plus side, the maintainer is actively pushing updates (the last commit was 20 hours ago as of writing).

Setup also requires some agent config knowledge. If you’ve never touched a CLAUDE.md or AGENTS.md file before, you’ll need to learn the basics first. It’s not zero-config; it’s closer to “minimal config if you know what you’re doing.”

The auto-capture can also miss nuance. When the debugging path was really convoluted, say 15-plus steps with branching decisions, the captured skill would sometimes over-simplify. Manual curation still matters for edge cases.

The Bottom Line

self-learning-skills is one of those tools that feels obvious once you’ve used it. The “Groundhog Day” problem is real, and this is the cleanest solution I’ve seen: not a curated library, not a raw memory dump, but a smart middle ground that captures how work gets done in your specific project. If you use Claude Code, Cursor, or Codex daily, it’s worth the 30-second install just to see what it catches. You’ll probably start wondering why your agent didn’t do this already.

Disclosure: Some links below are affiliate links. If you sign up through them, I may earn a commission at no extra cost to you.

📚 Deepen Your AI Agent Knowledge

If you’re building AI agents and want the theoretical foundation behind tools like self-learning-skills, Building LLM Powered Applications by Valentina Alto covers the fundamentals, from prompt engineering and retrieval-augmented generation to multi-agent orchestration patterns. It’s the architectural thinking that makes tools like this possible.

→ Building LLM-Powered Applications on Amazon

In short: self-learning-skills handles the how, capturing workflow knowledge automatically. This book handles the what: designing and building production-grade LLM applications from scratch. Together they cover both sides of the AI development stack.

→ Check price on Amazon ($30–60)
