As Boris Cherny put it: “I don’t prompt Claude anymore. I have loops running that prompt Claude.”
When I first read that, it hit me: I’d been wasting months polishing one-shot prompts when the real unlock was building systems that prompt themselves.
That’s Loop Engineering. Cobus Greyling’s repo (4,156★ in 21 days, about 198★/day) is the first complete toolkit I’ve seen that makes it practical. It’s not another prompt library, and it’s not a ‘how to write better prompts’ tutorial. It’s a full stack (CLI tools, 7 production patterns, and an MCP server) designed to shift you from writing prompts to designing control systems that orchestrate AI coding agents over time.
I spent a full day putting it through its paces on my Ryzen workstation. Here’s what Loop Engineering actually is, why it’s the next step in AI agent workflows, and how to start using it today.
TL;DR: What Is Loop Engineering?
The short version: Loop engineering is the discipline of designing feedback-control systems that prompt AI agents, rather than writing individual prompts by hand.
Think of the difference:
- One-shot prompting: “Make this website faster.” Claude runs once, makes some changes, done.
- Loop engineering: A system that finds the slowest page, makes one focused improvement, measures the result, keeps the change if it helps, discards it if not, and repeats until the target is met. All without you touching a prompt.
That’s the fundamental shift. Your job is no longer ‘prompt engineer’; it’s ‘loop designer.’ You build the control system, and the agents do the work.
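To make the keep-or-discard idea concrete, here is a minimal sketch of such a control loop. It is not code from the repo: `run_loop`, `fake_propose`, and `fake_measure` are hypothetical names, the "agent" is replaced by a fixed list of deltas so the example is deterministic, and the `max_iterations` cap is my own addition so the loop cannot run forever.

```python
from typing import Callable

State = dict[str, int]


def run_loop(
    measure: Callable[[State], float],
    propose: Callable[[State], State],
    state: State,
    target: float,
    max_iterations: int = 20,
) -> tuple[State, float, int]:
    """Repeat propose -> measure -> keep/discard until the target is met."""
    best = measure(state)
    iterations = 0
    while best > target and iterations < max_iterations:
        candidate = propose(state)   # one focused change
        score = measure(candidate)   # measure the result
        if score < best:             # keep the change only if it helps
            state, best = candidate, score
        iterations += 1              # a discarded change still costs a cycle
    return state, best, iterations


# Stand-ins for the agent and the benchmark (assumptions for illustration).
deltas = iter([-100, 200, -200, 50, -250, -100])


def fake_propose(state: State) -> State:
    # In a real loop this would be an agent call that edits the code.
    return {"load_ms": state["load_ms"] + next(deltas)}


def fake_measure(state: State) -> float:
    # In a real loop this would run a page-speed benchmark.
    return state["load_ms"]


final_state, final_ms, steps = run_loop(
    fake_measure, fake_propose, {"load_ms": 1000}, target=400
)
print(final_state, final_ms, steps)
```

Two of the six proposed changes make the page slower and are thrown away; the other four are kept. Nobody wrote a prompt between iterations, which is the whole point.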
The 5 Building Blocks of an Agent Loop
Loop Engineering defines five primitives that make up any agent loop, plus a memory layer that carries state between runs. Here’s how they map to a real example:
| Building Block | What It Does | Example in a “PR Babysitter” Loop |
|---|---|---|
| Automations & Scheduling | When does the loop trigger? | On every PR commit |
| Worktrees | Where does it run? | Fresh git worktree from the PR branch |
| Skills | What does the agent know? | Code review patterns, project style guide |
| MCP / Connectors | What tools can it use? | GitHub API, linters, test runner |
| Sub-agents | Who does what? | One agent reviews code, another runs tests |
| + Memory / State | What does it remember? | Previous review comments, issues found |
For a deeper look at the memory layer in agent workflows, I covered Recall: Fully-Local Project Memory for Claude Code a couple of weeks back.
That’s the whole model.
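One way to keep those building blocks straight is to write a loop down as data. The sketch below is my own illustration, not the repo's schema: `LoopSpec` and `missing_blocks` are hypothetical names, and the field values are copied from the PR Babysitter column of the table.

```python
from dataclasses import dataclass, field


@dataclass
class LoopSpec:
    """A loop described by the five building blocks plus optional memory."""

    name: str
    trigger: str                  # Automations & Scheduling
    worktree: str                 # Worktrees
    skills: list[str]             # Skills
    connectors: list[str]         # MCP / Connectors
    sub_agents: dict[str, str]    # Sub-agents: role -> responsibility
    memory: list[str] = field(default_factory=list)  # Memory / State

    def missing_blocks(self) -> list[str]:
        """Return the names of required blocks that are still empty."""
        required = {
            "trigger": self.trigger,
            "worktree": self.worktree,
            "skills": self.skills,
            "connectors": self.connectors,
            "sub_agents": self.sub_agents,
        }
        return [block for block, value in required.items() if not value]


pr_babysitter = LoopSpec(
    name="pr-babysitter",
    trigger="on every PR commit",
    worktree="fresh git worktree from the PR branch",
    skills=["code review patterns", "project style guide"],
    connectors=["GitHub API", "linters", "test runner"],
    sub_agents={"reviewer": "reviews code", "tester": "runs tests"},
    memory=["previous review comments", "issues found"],
)
print(pr_babysitter.missing_blocks())
```

An empty list means every block has an answer. If you cannot fill in one of the five fields for a loop you are planning, that is usually the part of the design that still needs thought.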
7 Production Patterns at a Glance
The repo ships with 7 ready-to-use patterns. Here’s what they cover:
| Pattern | Use Case | Best For |
|---|---|---|
| Daily Triage | Review unread issues, assign labels, suggest priorities | Maintainers drowning in notifications |
| PR Babysitter | Auto-review each commit, gate on test failures | CI-heavy teams |
| CI Sweeper | Detect flaky tests, flag regressions, suggest re-runs | Flaky CI pipelines |
| Dependency Sweeper | Check for stale deps, propose upgrades, verify compat | Keeping dependencies fresh |
| Changelog Drafter | Read merged PRs, auto-generate changelog entries | Release cycles |
| Post-Merge Cleanup | Delete temporary branches, close resolved issues | Housekeeping |
| Issue Triage | Scan new issues, classify by type/severity, suggest assignee | High-traffic repos |
The two I found most immediately useful were Daily Triage and PR Babysitter. They handle the kind of low-level, repetitive work that burns developer time every day.
Hands-On: CLI Tools in Action
The toolkit ships with four CLI tools: loop-audit, loop-init, loop-cost, and loop-sync. I ran three of them on a personal project to see how they actually feel.
loop-init — scaffold a pattern in 30 seconds
npx @cobusgreyling/loop-init . --pattern daily-triage --tool claude-code
It creates a .loop/ directory in your project with the full prompt chain, configuration, and metadata. It took me maybe 20 seconds. The interactive mode also lets you pick from the 7 patterns with a terminal picker, so there’s no need to remember names.
loop-cost — estimate token spend before you run
npx @cobusgreyling/loop-cost --pattern daily-triage --level L1
Output: a breakdown of estimated tokens per cycle. For the daily-triage pattern at L1, it estimated ~8,200 tokens per run. That’s about $0.04 to $0.08 per run with Claude Sonnet. Worth knowing before you accidentally schedule a loop that costs $50/day.
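If you want to sanity-check that dollar range yourself, the arithmetic is short. This sketch is not how loop-cost computes anything; `run_cost_usd` is a hypothetical helper, and the $3 and $15 per million input and output tokens are assumed Sonnet list prices that you should check against current pricing. The split between input and output tokens is also an assumption, since the tool's estimate above is a single total.

```python
def run_cost_usd(
    total_tokens: int,
    output_share: float,
    input_usd_per_mtok: float = 3.0,    # assumed price, verify before relying on it
    output_usd_per_mtok: float = 15.0,  # assumed price, verify before relying on it
) -> float:
    """Estimate the cost of one loop run from a total token count."""
    output_tokens = total_tokens * output_share
    input_tokens = total_tokens - output_tokens
    cost = input_tokens * input_usd_per_mtok + output_tokens * output_usd_per_mtok
    return cost / 1_000_000


# ~8,200 tokens per run, with 20% and 50% of them being output tokens.
for share in (0.2, 0.5):
    per_run = run_cost_usd(8_200, share)
    print(f"output share {share:.0%}: ${per_run:.4f} per run")
```

Under those assumptions the two cases land at roughly $0.044 and $0.074 per run, which brackets the range quoted above. Output tokens dominate the bill, so a loop that writes long reports costs noticeably more than one that mostly reads.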
loop-audit — check your project’s loop readiness
npx @cobusgreyling/loop-audit . --suggest
It scanned my project, flagged missing GitHub tokens and unconfigured skill paths, and suggested a reasonable daily-triage budget. It also has a --badge flag that generates a “Loop Ready” badge for your README.
The workflow is: loop-cost to plan → loop-init to scaffold → loop-audit to verify. I ran through all three in about 5 minutes.
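If you repeat that plan, scaffold, verify sequence across several projects, it can help to generate the commands instead of retyping them. The sketch below only builds and prints the three invocations shown earlier; `plan_commands` is a hypothetical helper, and it uses no flags beyond the ones that appear in this post.

```python
import shlex


def plan_commands(
    pattern: str, level: str, tool: str, project_dir: str = "."
) -> list[list[str]]:
    """Return the cost -> init -> audit invocations as argv lists, in order."""
    return [
        ["npx", "@cobusgreyling/loop-cost", "--pattern", pattern, "--level", level],
        ["npx", "@cobusgreyling/loop-init", project_dir, "--pattern", pattern, "--tool", tool],
        ["npx", "@cobusgreyling/loop-audit", project_dir, "--suggest"],
    ]


commands = plan_commands("daily-triage", "L1", "claude-code")
for argv in commands:
    # Dry run: print each command instead of executing it.
    print(shlex.join(argv))
```

To actually execute the steps, you could pass each argv list to `subprocess.run(argv, check=True)` so the sequence stops at the first failing command.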
Loop Engineering vs. Loopy vs. Agent Skills
This space is moving fast. Three projects are worth comparing, two of which emerged in the last month alone, and they solve different problems:
| Dimension | Loop Engineering | Loopy (Forward-Future) | Agent Skills (addyosmani) |
|---|---|---|---|
| Core focus | Toolkit + pattern library + CLI + MCP | Library of loop prompts + agent skill | Agent skill marketplace |
| CLI tools | ✅ loop-audit, loop-init, loop-cost, loop-sync | ❌ (npx skills add only) | ❌ |
| MCP Server | ✅ Loop MCP Server for runtime lookup | ❌ | ❌ |
| Patterns | 7 production patterns | Community catalog | Skill catalog |
| Cross-tool | Grok, Claude Code, Codex, Cursor, GitHub Actions | Codex, Cursor | Claude Code |
| Setup | npx (zero dependencies) | npx skills add | npx skills add |
| Growth rate | 4,156★ / 21 days ≈ 198★/day | 2,170★ / 18 days ≈ 121★/day | 52,500★ (long-established) |
The key difference: Loop Engineering isn’t a prompt library. It’s a complete toolkit: CLI audit tools, pattern scaffolding, an MCP server, and a security framework. Loopy is closer to a well-presented prompt directory, without CLI tools or deployable components. Both have their place, but if you want to actually operationalize loops, Loop Engineering gives you the infrastructure.
Deploying the MCP Server for Runtime Lookups
Here’s one feature I didn’t expect: the MCP server. Running node tools/mcp-server/dist/index.js starts a server that provides runtime pattern and skill lookups, so your loop can query its own pattern library mid-run.
This is where the DigitalOcean recommendation comes in. If you want this running 24/7 (say, a PR Babysitter loop that keeps reviewing commits even when your laptop is closed), you need a small VPS. I covered the setup in my LFG review, and the same $6/month Droplet tier is plenty for the MCP server plus any scheduled loops.
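To let an agent reach that server, you typically register the launch command with your MCP client. The sketch below generates a config in the `mcpServers` shape that Claude Code reads from a project-level `.mcp.json`; I have not confirmed this against the repo's own docs, so treat it as an assumption. The server name `loop-engineering` is a label I made up, and the relative path assumes you run from the repo root after building the server.

```python
import json


def mcp_config(server_name: str, entrypoint: str) -> dict:
    """Build a stdio MCP server entry that launches the server with node."""
    return {
        "mcpServers": {
            server_name: {
                "command": "node",
                "args": [entrypoint],
            }
        }
    }


# "loop-engineering" is a hypothetical label; the path comes from the command above.
config = mcp_config("loop-engineering", "tools/mcp-server/dist/index.js")
print(json.dumps(config, indent=2))
```

Other MCP clients use similar but not identical config files, so check your client's documentation for the exact file name and location.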
Limitations You Should Know
I’d be lying if I said this was all smooth sailing. A few things to watch for:
Token costs can surprise you. At L3 fidelity, a single triage run can chew through 30K+ tokens. If you schedule this to run hourly on a busy repo, you’ll notice the bill. Use loop-cost before you commit to a schedule.
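A quick way to turn a per-run estimate into a schedule is to work backwards from a daily token budget. This is a back-of-the-envelope sketch, not something the toolkit provides: both helper names are hypothetical, and the 200,000-token daily budget is an example figure I picked.

```python
def max_runs_per_day(daily_token_budget: int, tokens_per_run: int) -> int:
    """How many full runs fit inside the daily budget."""
    return daily_token_budget // tokens_per_run


def min_interval_hours(daily_token_budget: int, tokens_per_run: int) -> float:
    """Shortest schedule interval that stays within the daily budget."""
    runs = max_runs_per_day(daily_token_budget, tokens_per_run)
    if runs == 0:
        raise ValueError("budget is smaller than a single run")
    return 24 / runs


tokens_per_run = 30_000                      # L3 triage run, from the estimate above
hourly_tokens_per_day = tokens_per_run * 24  # what an hourly schedule consumes
print(hourly_tokens_per_day)
print(max_runs_per_day(200_000, tokens_per_run))
print(min_interval_hours(200_000, tokens_per_run))
```

An hourly schedule at 30K tokens per run consumes 720,000 tokens a day. With the example 200,000-token budget, you get six runs a day, or one every four hours.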
Human gates are non-negotiable. The PR Babysitter pattern is designed to review code, not auto-merge it. If you let loops make production changes without human approval, you’re asking for trouble. The repo’s security model handles this with denylists and auto-merge gates, but the guardrails are only as good as your configuration.
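To show what a gate like that looks like in practice, here is a minimal sketch of a denylist plus approval check. It is not the repo's security model: `DENYLIST`, `GATED_ACTIONS`, `ProposedAction`, and `decide` are all hypothetical names, and the patterns are examples you would replace with your own.

```python
from dataclasses import dataclass
from fnmatch import fnmatch

# Hypothetical configuration: paths a loop may never touch,
# and action kinds that always need a human sign-off.
DENYLIST = ["infra/*", "*.env", ".github/workflows/*"]
GATED_ACTIONS = {"merge", "deploy"}


@dataclass
class ProposedAction:
    kind: str          # e.g. "comment", "merge", "deploy"
    paths: list[str]   # files the action would touch


def decide(action: ProposedAction, human_approved: bool = False) -> str:
    """Return "blocked", "needs_approval", or "allowed" for a proposed action."""
    # Denylisted paths are blocked outright; approval cannot override them.
    if any(fnmatch(path, pattern) for path in action.paths for pattern in DENYLIST):
        return "blocked"
    # Gated actions wait for an explicit human approval.
    if action.kind in GATED_ACTIONS and not human_approved:
        return "needs_approval"
    return "allowed"


print(decide(ProposedAction("comment", ["src/app.py"])))
print(decide(ProposedAction("merge", ["src/app.py"])))
print(decide(ProposedAction("merge", ["infra/main.tf"]), human_approved=True))
```

The ordering matters: the denylist is checked before approval, so a tired reviewer clicking "approve" cannot wave through a change to protected paths.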
Comprehension debt is real. If you set up five loops and never read their output, you’ll accumulate technical debt faster than you accumulate automation cred. Loops need monitoring. The loop-audit tool helps, but it can’t tell you whether the automated decisions are correct.
Who Should Use This?
- You’re already using Claude Code, Codex, or Cursor and feel like manual prompting is the bottleneck
- You maintain open-source projects and want automated issue triage or PR babysitting
- Your CI pipeline has flaky tests and you want a loop that detects and flags them automatically
- You dislike repetitive dev workflows — changelogs, dependency updates, post-merge cleanup
Skip this if you’re still getting comfortable with basic AI coding assistance. Loop engineering assumes you’ve already hit the ceiling on manual prompting.
The Bottom Line
Loop Engineering is the first toolkit I’ve seen that takes the concept of ‘design loops, not prompts’ from abstract advice to something you can actually use on Monday morning. The CLI tools are polished, the patterns cover real pain points, and the documentation is some of the best I’ve seen in the AI tooling space.
My verdict: If you’re past the beginner stage with AI coding agents, this will change how you think about your workflow. Not dramatically — fundamentally. The question shifts from ‘what should I prompt?’ to ‘what loops should I design?’
As Addy Osmani put it: “Build the loop. But build it like someone who intends to stay the engineer — not just the person who presses go.”
Deploy Your Own Agent Loops
Disclosure: Some links below are affiliate links. If you sign up through them, I may earn a commission at no extra cost to you.
If you want to run Loop Engineering 24/7 — PR Babysitter, Daily Triage, or any scheduled loop — you need a small VPS that stays on even when your laptop's closed:
- DigitalOcean — $200 free credit for 60 days, deploy a $6/month Droplet in under a minute. Enough to run the MCP server plus multiple scheduled loops.
- Vultr — Alternative with $100 trial credit, 32 global datacenter locations, deploy a cloud VPS in 30 seconds.
Recommended Reading: Building LLM Powered Applications: Create Intelligent Apps and Agents with Large Language Models — Go deeper into the architecture behind agent loops and LLM-powered systems.