As Boris Cherny put it: ‘I don’t prompt Claude anymore. I have loops running that prompt Claude.’

When I first read that, it hit me: I’d been wasting months polishing one-shot prompts when the real unlock was building systems that prompt themselves.

That’s Loop Engineering. Cobus Greyling’s repo (4,156★ in 21 days, roughly 198★/day) is the first complete toolkit that makes it practical. It’s not another prompt library, and it’s not a ‘how to write better prompts’ tutorial. It’s a full stack (CLI tools, 7 production patterns, and an MCP server) designed to shift you from writing prompts to designing control systems that orchestrate AI coding agents over time.

I spent a full day putting it through its paces on my Ryzen workstation. Here’s what Loop Engineering actually is, why it’s the next step in AI agent workflows, and how to start using it today.

TL;DR: What Is Loop Engineering?

The short version: Loop engineering is the discipline of designing feedback-control systems that prompt AI agents, rather than writing individual prompts by hand.

Think of the difference:

  • One-shot prompting: “Make this website faster.” Claude runs once, makes some changes, done.
  • Loop engineering: A system that finds the slowest page, makes one focused improvement, measures the result, keeps the change if it helps, discards it if not, and repeats until the target is met. All without you touching a prompt.

That’s the fundamental shift. Your job is no longer ‘prompt engineer’; it’s ‘loop designer.’ You build the control system, and the agents do the work.
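To make the measure, change, keep-or-discard cycle concrete, here is a minimal Python sketch. It is not code from the repo: `run_improvement_loop`, `LoopResult`, and the toy "page load" numbers are all hypothetical names I made up for illustration, and a real loop would call an agent where the toy `propose_change` just adjusts a number.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LoopResult:
    iterations: int
    kept: int
    final_metric: float


def run_improvement_loop(
    measure: Callable[[], float],
    propose_change: Callable[[], Optional[Callable[[], None]]],
    target: float,
    max_iterations: int = 10,
) -> LoopResult:
    """Measure, try one change, keep it only if the metric improves, repeat.

    `propose_change` applies one focused change and returns an undo callback,
    or None when it has nothing left to try. Lower metric values are better.
    """
    best = measure()
    kept = 0
    iterations = 0
    while best > target and iterations < max_iterations:
        undo = propose_change()
        if undo is None:
            break  # nothing left to try
        iterations += 1
        current = measure()
        if current < best:
            best = current  # the change helped: keep it
            kept += 1
        else:
            undo()  # the change did not help: discard it
    return LoopResult(iterations, kept, best)


# Toy stand-in for "page load time in ms" (hypothetical numbers).
state = {"load_ms": 900.0}
deltas = iter([-200.0, 50.0, -300.0, -150.0])


def measure() -> float:
    return state["load_ms"]


def propose_change() -> Optional[Callable[[], None]]:
    delta = next(deltas, None)
    if delta is None:
        return None
    state["load_ms"] += delta

    def undo() -> None:
        state["load_ms"] -= delta

    return undo


result = run_improvement_loop(measure, propose_change, target=300.0)
print(result)
```

The stop conditions matter as much as the loop body: a target, an iteration cap, and an "out of ideas" exit keep the loop from running (and spending) forever.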

The 5 Building Blocks of an Agent Loop

Loop Engineering defines five primitives that make up any agent loop. Here’s how they map to a real example:

Building Block | What It Does | Example in a “PR Babysitter” Loop
Automations & Scheduling | When does the loop trigger? | On every PR commit
Worktrees | Where does it run? | Fresh git worktree from the PR branch
Skills | What does the agent know? | Code review patterns, project style guide
MCP / Connectors | What tools can it use? | GitHub API, linters, test runner
Sub-agents | Who does what? | One agent reviews code, another runs tests
+ Memory / State | What does it remember? | Previous review comments, issues found

For a deeper look at the memory layer in agent workflows, I covered Recall: Fully-Local Project Memory for Claude Code a couple of weeks back.

That’s the whole model: five primitives plus state.
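One way to keep the building blocks straight is to write a loop down as a data structure. The sketch below is my own illustration, not the repo’s configuration schema: `LoopSpec` and its field names are hypothetical, and the values come straight from the PR Babysitter column of the table. It assumes Python 3.9 or newer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopSpec:
    """One agent loop described by the five primitives plus memory."""

    name: str
    trigger: str                   # Automations & Scheduling
    worktree: str                  # Worktrees
    skills: tuple[str, ...]        # Skills
    connectors: tuple[str, ...]    # MCP / Connectors
    sub_agents: dict[str, str]     # Sub-agents: role -> responsibility
    memory: tuple[str, ...] = ()   # Memory / State (optional extra)

    def missing_blocks(self) -> list[str]:
        """Return the names of the five primitives that were left empty."""
        checks = {
            "trigger": self.trigger,
            "worktree": self.worktree,
            "skills": self.skills,
            "connectors": self.connectors,
            "sub_agents": self.sub_agents,
        }
        return [name for name, value in checks.items() if not value]


pr_babysitter = LoopSpec(
    name="pr-babysitter",
    trigger="on every PR commit",
    worktree="fresh git worktree from the PR branch",
    skills=("code review patterns", "project style guide"),
    connectors=("GitHub API", "linters", "test runner"),
    sub_agents={"reviewer": "reviews code", "tester": "runs tests"},
    memory=("previous review comments", "issues found"),
)

print(pr_babysitter.missing_blocks())
```

Filling in a spec like this before scaffolding anything is a quick check that you have actually answered all five questions for your loop.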

7 Production Patterns at a Glance

The repo ships with 7 ready-to-use patterns. Here’s what they cover:

Pattern | Use Case | Best For
Daily Triage | Review unread issues, assign labels, suggest priorities | Maintainers drowning in notifications
PR Babysitter | Auto-review each commit, gate on test failures | CI-heavy teams
CI Sweeper | Detect flaky tests, flag regressions, suggest re-runs | Flaky CI pipelines
Dependency Sweeper | Check for stale deps, propose upgrades, verify compat | Keeping dependencies fresh
Changelog Drafter | Read merged PRs, auto-generate changelog entries | Release cycles
Post-Merge Cleanup | Delete temporary branches, close resolved issues | Housekeeping
Issue Triage | Scan new issues, classify by type/severity, suggest assignee | High-traffic repos

The two I found most immediately useful were Daily Triage and PR Babysitter. These handle the kind of low-level, repetitive work that burns developer time every day.
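To give a feel for the kind of check a pattern like CI Sweeper automates, here is a small sketch of one common flaky-test heuristic: a test is suspicious when the same commit produced both a pass and a fail. This is my own illustration under that assumption, not the repo’s detection logic; `CiRun` and `find_flaky_tests` are hypothetical names, and the sample data is made up.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class CiRun(NamedTuple):
    commit: str
    test: str
    passed: bool


def find_flaky_tests(runs: Iterable[CiRun]) -> list[str]:
    """Flag tests that both passed and failed on the same commit."""
    outcomes: dict[tuple[str, str], set[bool]] = defaultdict(set)
    for run in runs:
        outcomes[(run.commit, run.test)].add(run.passed)
    # Two distinct outcomes for one (commit, test) pair means the result
    # changed without the code changing.
    flaky = {test for (_, test), seen in outcomes.items() if len(seen) == 2}
    return sorted(flaky)


history = [
    CiRun("abc123", "test_checkout", True),
    CiRun("abc123", "test_checkout", False),  # same commit, different result
    CiRun("abc123", "test_login", False),
    CiRun("abc123", "test_login", False),     # fails consistently: not flaky
    CiRun("def456", "test_search", True),
]

print(find_flaky_tests(history))
```

A test that fails every time on a commit is more likely a regression than a flake, which is why the consistent `test_login` failure is left out.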

Hands-On: CLI Tools in Action

The toolkit ships with four CLI tools. I ran three of them on a personal project to see how they actually feel.

loop-init — scaffold a pattern in 30 seconds

npx @cobusgreyling/loop-init . --pattern daily-triage --tool claude-code

It creates a .loop/ directory in your project with the full prompt chain, configuration, and metadata. Took me maybe 20 seconds. The interactive mode also lets you pick from the 7 patterns with a nice terminal picker, so there’s no need to remember names.

loop-cost — estimate token spend before you run

npx @cobusgreyling/loop-cost --pattern daily-triage --level L1

Output: a breakdown of estimated tokens per cycle. For the daily-triage pattern at L1, it estimated ~8,200 tokens per run, or about $0.04-0.08 with Claude Sonnet. Worth knowing before you accidentally schedule a loop that costs $50/day.
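If you want to sanity-check an estimate like that yourself, the arithmetic is simple. The sketch below is mine, not what loop-cost does internally. The input/output split of the 8,200 tokens and the per-million-token prices ($3 input, $15 output) are assumptions for illustration, so check current pricing before relying on the numbers.

```python
import math


def run_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Cost of one loop run, given token counts and prices per million tokens."""
    total = (
        input_tokens * input_price_per_mtok
        + output_tokens * output_price_per_mtok
    )
    return total / 1_000_000


def runs_to_reach(budget_usd: float, cost_per_run: float) -> int:
    """How many runs it takes for total spend to reach the given budget."""
    return math.ceil(budget_usd / cost_per_run)


# Assumed split of the ~8,200 tokens per run (hypothetical).
INPUT_TOKENS = 6_700
OUTPUT_TOKENS = 1_500

# Assumed prices in USD per million tokens (verify against current pricing).
INPUT_PRICE = 3.0
OUTPUT_PRICE = 15.0

per_run = run_cost_usd(INPUT_TOKENS, OUTPUT_TOKENS, INPUT_PRICE, OUTPUT_PRICE)
print(f"per run: ${per_run:.4f}")
print(f"once a day for 30 days: ${per_run * 30:.2f}")
print(f"runs needed to spend $50: {runs_to_reach(50.0, per_run)}")
```

The split matters because output tokens are priced several times higher than input tokens under these assumptions, so a loop that writes a lot costs more than one that mostly reads.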

loop-audit — check your project’s loop readiness

npx @cobusgreyling/loop-audit . --suggest

It scanned my project, flagged missing GitHub tokens and unconfigured skill paths, and suggested a reasonable daily-triage budget. It also has a --badge flag that generates a “Loop Ready” badge for your README.

The workflow is: loop-cost to plan → loop-init to scaffold → loop-audit to verify. I ran through all three in about 5 minutes.

Loop Engineering vs. Loopy vs. Agent Skills

This space is moving fast. Three projects now cover similar ground, two of them less than a month old, and they solve different problems:

Dimension | Loop Engineering | Loopy (Forward-Future) | Agent Skills (addyosmani)
Core focus | Toolkit + pattern library + CLI + MCP | Library of loop prompts + agent skill | Agent skill marketplace
CLI tools | ✅ loop-audit, loop-init, loop-cost, loop-sync | ❌ (npx skills add only) |
MCP Server | ✅ Loop MCP Server for runtime lookup | |
Patterns | 7 production patterns | Community catalog | Skill catalog
Cross-tool | Grok, Claude Code, Codex, Cursor, GitHub Actions | Codex, Cursor | Claude Code
Setup | npx (zero dependencies) | npx skills add | npx skills add
Growth rate | 4,156★ / 21 days ≈ 198★/day | 2,170★ / 18 days ≈ 121★/day | 52,500★ (long-established)

The key difference: Loop Engineering isn’t a prompt library. It’s a complete toolkit with CLI audit tools, pattern scaffolding, an MCP server, and a security framework. Loopy is more of a beautiful prompt directory without CLI tools or deployable components. Both have their place, but if you want to actually operationalize loops, Loop Engineering gives you the infrastructure.

Deploying the MCP Server for Runtime Lookups

Here’s one feature I didn’t expect: the MCP server. Running node tools/mcp-server/dist/index.js starts a server that provides runtime pattern and skill lookups, so your loop can query its own pattern library mid-run.
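To let an agent reach that server, you typically register it with your MCP client. The sketch below generates a config entry in the `mcpServers` shape that clients such as Claude Code read from a project-level .mcp.json. Two things here are assumptions on my part: that the server speaks over stdio when launched with node, and the server name `loop-engineering`, which is a label I picked. Only the entry-point path comes from the repo.

```python
import json


def mcp_server_entry(entry_point: str) -> dict:
    """Build a stdio-style MCP server entry that launches a Node script."""
    return {"command": "node", "args": [entry_point]}


config = {
    "mcpServers": {
        # "loop-engineering" is a hypothetical label, not an official name.
        "loop-engineering": mcp_server_entry("tools/mcp-server/dist/index.js"),
    }
}

# Print the JSON so it can be reviewed before saving it as .mcp.json.
print(json.dumps(config, indent=2))
```

Printing instead of writing the file keeps the script safe to run anywhere. Redirect the output to a file once you have confirmed the shape matches what your client expects.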

This is where the DigitalOcean recommendation comes in. If you want this running 24/7 (say, a PR Babysitter loop that keeps reviewing commits even when your laptop is closed), you need a small VPS. I covered the setup in my LFG review, and the same $6/month Droplet tier is plenty for the MCP server plus any scheduled loops.

Limitations You Should Know

I’d be lying if I said this was all smooth sailing. A few things to watch for:

Token costs can surprise you. At L3 fidelity, a single triage run can chew through 30K+ tokens. If you schedule this to run hourly on a busy repo, you’ll notice the bill. Use loop-cost before you commit to a schedule.
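Beyond estimating up front, it can help to put a hard cap inside the loop itself. Here is a minimal sketch of a daily token budget guard. `TokenBudget` and `BudgetExceeded` are hypothetical names, not part of the toolkit, and the 500,000-token limit is an arbitrary example.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a run would push usage past the daily limit."""


@dataclass
class TokenBudget:
    daily_limit: int
    used: int = 0

    def charge(self, tokens: int) -> None:
        """Record token usage, refusing any run that would exceed the limit."""
        if self.used + tokens > self.daily_limit:
            raise BudgetExceeded(
                f"{self.used + tokens} tokens would exceed {self.daily_limit}"
            )
        self.used += tokens


# Hourly schedule at roughly 30K tokens per run, capped at 500K per day.
budget = TokenBudget(daily_limit=500_000)
completed = 0
for hour in range(24):
    try:
        budget.charge(30_000)
    except BudgetExceeded:
        break  # stop scheduling runs for the rest of the day
    completed += 1

print(completed, budget.used)
```

Without the cap, 24 hourly runs at 30K tokens each would come to 720,000 tokens a day, so the guard here trades a few skipped runs for a predictable bill.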

Human gates are non-negotiable. The PR Babysitter pattern is designed to review code, not auto-merge it. If you let loops make production changes without human approval, you’re asking for trouble. The repo’s security model handles this with denylists and auto-merge gates, but the guardrails are only as good as your configuration.
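As a rough picture of what a gate like this checks, here is a short sketch combining a path denylist with an explicit human approval flag. It is not the repo’s security model: the denylist patterns, `blocked_paths`, and `may_merge` are all hypothetical, and the matching uses Python’s standard fnmatch, where `*` also matches path separators.

```python
from fnmatch import fnmatch
from typing import Iterable, Sequence

# Hypothetical denylist: paths a loop should never change on its own.
DENYLIST = ("infra/*", "*.env", ".github/workflows/*")


def blocked_paths(
    changed: Iterable[str], denylist: Sequence[str] = DENYLIST
) -> list[str]:
    """Return the changed paths that match any denylist pattern."""
    return [
        path
        for path in changed
        if any(fnmatch(path, pattern) for pattern in denylist)
    ]


def may_merge(
    changed: Iterable[str], tests_passed: bool, human_approved: bool
) -> bool:
    """Allow a merge only with passing tests, human sign-off, and no blocked paths."""
    return tests_passed and human_approved and not blocked_paths(changed)


print(blocked_paths(["src/app.py", "infra/prod/main.tf"]))
print(may_merge(["src/app.py"], tests_passed=True, human_approved=False))
print(may_merge(["src/app.py"], tests_passed=True, human_approved=True))
```

The important design choice is that `human_approved` is a required input and never defaults to true. The loop can prepare everything, but the final decision stays with a person.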

Comprehension debt is real. If you set up five loops and never read their output, you’ll accumulate technical debt faster than you accumulate automation cred. Loops need monitoring. The loop-audit tool helps, but it can’t tell you whether the automated decisions are correct.

Who Should Use This?

  • You’re already using Claude Code, Codex, or Cursor and feel like manual prompting is the bottleneck
  • You maintain open-source projects and want automated issue triage or PR babysitting
  • Your CI pipeline has flaky tests and you want a loop that detects and flags them automatically
  • You dislike repetitive dev workflows — changelogs, dependency updates, post-merge cleanup

Skip this if you’re still getting comfortable with basic AI coding assistance. Loop engineering assumes you’ve already hit the ceiling on manual prompting.

The Bottom Line

Loop Engineering is the first toolkit I’ve seen that takes the concept of ‘design loops, not prompts’ from abstract advice to something you can actually use on Monday morning. The CLI tools are polished, the patterns cover real pain points, and the documentation is some of the best I’ve seen in the AI tooling space.

My verdict: If you’re past the beginner stage with AI coding agents, this will change how you think about your workflow. Not dramatically — fundamentally. The question shifts from ‘what should I prompt?’ to ‘what loops should I design?’

As Addy Osmani put it: “Build the loop. But build it like someone who intends to stay the engineer — not just the person who presses go.”

Deploy Your Own Agent Loops

Disclosure: Some links below are affiliate links. If you sign up through them, I may earn a commission at no extra cost to you.

If you want to run Loop Engineering 24/7 — PR Babysitter, Daily Triage, or any scheduled loop — you need a small VPS that stays on even when your laptop's closed:

  • DigitalOcean: $200 free credit for 60 days, deploy a $6/month Droplet in under a minute. Enough to run the MCP server plus multiple scheduled loops.
  • Vultr: Alternative with $100 trial credit, 32 global datacenter locations, deploy a cloud VPS in 30 seconds.

Recommended Reading: Building LLM Powered Applications: Create Intelligent Apps and Agents with Large Language Models — Go deeper into the architecture behind agent loops and LLM-powered systems.