Agent Skills: 7 Commands That Make AI Agents Write Prod Code

Thu, 11 Jun 2026 00:00:00 +0000

AI coding agents are incredible at generating code fast. But ask them to write a spec before touching the keyboard? Review their own output before shipping? Crickets.

The problem isn’t the models — it’s that agents lack structured engineering workflows. They jump straight to code. No plan. No tests. No review. It works for small scripts, but for anything production-grade, that shortcut burns you sooner or later.

That’s exactly what agent-skills (52.5k★ on GitHub, trending #1 as I type this) is built to fix. Addy Osmani — yes, the Google Chrome engineering manager who wrote the book on JavaScript patterns — put together 7 slash commands that encode senior engineer workflows for Claude Code, Cursor, Gemini CLI, and OpenCode. Install once, and your agent suddenly knows how to spec-first, test-drive, and review-before-merge. Like having a staff engineer sitting next to your terminal.

This pairs nicely with today’s claude-mem review. If claude-mem gives your agent a memory, agent-skills gives it a work ethic. Together they form a pretty complete coding agent setup.

The 7 Commands

Here’s the full pipeline. Each command maps to a real engineering phase:

Command	Phase	What It Does
`/spec`	Define	Writes a detailed specification before any code. Defines scope, inputs, outputs, edge cases
`/plan`	Plan	Breaks the spec into small, atomic implementation tasks
`/build`	Build	Implements one slice at a time — no more 500-line monster PRs
`/test`	Verify	Generates tests alongside the code. Tests are proof the code works
`/review`	Review	Reviews the code you just wrote — catches issues before you merge
`/code-simplify`	Simplify	Refactors for clarity. “Clarity over cleverness” is the guiding principle
`/ship`	Deploy	Prepares the final commit. Change log, summary, ready to push
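To make the ordering concrete, here’s a rough mental model of the table as a pipeline in Python. This is only an illustration: the `Step` type and the `next_command` helper are hypothetical names I made up, not anything agent-skills exposes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Step:
    command: str  # slash command as typed in the agent
    phase: str    # engineering phase it maps to


# Order taken from the table above.
PIPELINE = [
    Step("/spec", "Define"),
    Step("/plan", "Plan"),
    Step("/build", "Build"),
    Step("/test", "Verify"),
    Step("/review", "Review"),
    Step("/code-simplify", "Simplify"),
    Step("/ship", "Deploy"),
]


def next_command(current: str) -> Optional[str]:
    """Return the command that follows `current`, or None after the last step."""
    commands = [step.command for step in PIPELINE]
    index = commands.index(current)  # raises ValueError for unknown commands
    return commands[index + 1] if index + 1 < len(commands) else None
```

The point of the model is simply that nothing reaches `/ship` without passing through the verify and review phases first.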

I installed this on a personal React + FastAPI side project I’ve been building with Claude Code. My workflow before agent-skills was basically: type a prompt → Claude generates code → I manually review → ship. But about 40% of the time I’d catch something on the second pass: missing edge cases, no error handling, tests that only exercise the happy path.

After setting up agent-skills, I ran through the full pipeline on a new API endpoint. Here’s what that looked like.

Quick Start — One Command

For Claude Code users, the install is dead simple:

/plugin marketplace add addyosmani/agent-skills
/plugin install agent-skills@addy-agent-skills

That’s it. Took me about 15 seconds. In Cursor, you drop the SKILL.md into .cursor/rules/. Gemini CLI and OpenCode have native support via .gemini/commands and .opencode/ directories, so no extra config is needed.

Walking Through a Real Workflow

I picked a concrete task: “Add a WebSocket endpoint for real-time logs to the monitoring dashboard.” Something that touches frontend, backend, and state management — classic multi-layer work that usually trips agents up.

Step 1 — /spec: The agent asked clarifying questions about reconnection strategy, auth requirements, and log retention. Then it wrote a 3-page spec covering the full WebSocket lifecycle. That took about 2 minutes. But here’s the kicker — before agent-skills, I would’ve started coding immediately and discovered these edge cases three refactors later.

Step 2 - /plan: The spec got broken into 5 atomic tasks: (1) backend connection manager, (2) WebSocket handler, (3) frontend hook, (4) UI component, (5) integration test. Each was small enough to complete in one session, the kind of granularity that makes a real difference when you’re context-switching.
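For a sense of what task (1) might involve, here’s a minimal sketch of a connection manager in plain asyncio. It is not the code the agent generated for me. The `LogClient` protocol and the `ConnectionManager` class are assumed names, and the only thing a client needs is an async `send_text` method, so this runs without FastAPI installed.

```python
import asyncio
from typing import Protocol


class LogClient(Protocol):
    """Anything with an async send_text method (hypothetical interface)."""

    async def send_text(self, data: str) -> None: ...


class ConnectionManager:
    """Tracks connected clients and fans log lines out to all of them."""

    def __init__(self) -> None:
        self._clients: set[LogClient] = set()

    def connect(self, client: LogClient) -> None:
        self._clients.add(client)

    def disconnect(self, client: LogClient) -> None:
        self._clients.discard(client)

    @property
    def count(self) -> int:
        return len(self._clients)

    async def broadcast(self, line: str) -> None:
        # Snapshot the set so clients can be dropped after sending.
        clients = list(self._clients)
        results = await asyncio.gather(
            *(client.send_text(line) for client in clients),
            return_exceptions=True,
        )
        # A failed send is treated as a dead connection and removed.
        for client, result in zip(clients, results):
            if isinstance(result, Exception):
                self.disconnect(client)
```

Keeping the manager free of framework types is a design choice for the sketch: it makes the piece small enough to test on its own, which is the kind of slice /plan is aiming for.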

Step 3 - /build through /ship: I ran each command in sequence. The /test step generated integration tests that caught a race condition I would’ve missed: the WebSocket reconnection loop was firing twice on a dropped connection. The /review step flagged an unused import and an unhandled error path. What surprised me most was /code-simplify, which collapsed a 40-line state machine into 25 lines with clearer logic.
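The double-firing reconnect is a common enough bug that it’s worth a sketch. One usual guard is to keep a single in-flight reconnect task and ignore further drop signals until it finishes. The example below shows that idea in asyncio. It is my assumption about a reasonable fix, not the actual patch from my project, and `Reconnector` and `on_drop` are hypothetical names.

```python
import asyncio
from typing import Any, Callable, Coroutine, Optional


class Reconnector:
    """Ensures at most one reconnect attempt is in flight at a time."""

    def __init__(self, connect: Callable[[], Coroutine[Any, Any, None]]) -> None:
        self._connect = connect
        self._task: Optional[asyncio.Task] = None

    def on_drop(self) -> asyncio.Task:
        """Call from inside a running event loop when the socket drops."""
        # An error event and a close event can both report the same drop;
        # reuse the pending task instead of starting a second attempt.
        if self._task is None or self._task.done():
            self._task = asyncio.create_task(self._connect())
        return self._task
```

Calling `on_drop()` twice in a row returns the same task, so the connect coroutine runs once. After that attempt completes, the next drop is allowed to start a fresh one.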

Total time for the full pipeline: about 45 minutes. The same task without this structure would’ve taken me 20 minutes of coding plus 30 minutes of debugging the stuff I missed, so about 50 minutes. Roughly the same total time, but the output was significantly higher quality, with test coverage baked in from the start.

What to Watch Out For

Agent-skills is young. The commands work well for the happy path, but complex edge cases — like nested sub-tasks or cross-repository refactors — can produce specs that are overly verbose or plans that miss downstream dependencies. I noticed the /plan output sometimes creates tasks that overlap, especially when the change touches multiple files.
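If you want a quick sanity check for overlapping tasks, one option is to list the files each task expects to touch and look for shared entries. The sketch below assumes you have written that mapping down by hand. agent-skills does not produce this structure as far as I know, and the file paths are invented for the example.

```python
from itertools import combinations


def overlapping_tasks(plan: dict[str, set[str]]) -> list[tuple[str, str, set[str]]]:
    """Return (task_a, task_b, shared_files) for every pair that shares a file."""
    overlaps = []
    for (name_a, files_a), (name_b, files_b) in combinations(plan.items(), 2):
        shared = files_a & files_b
        if shared:
            overlaps.append((name_a, name_b, shared))
    return overlaps


# Hypothetical plan: task names follow the walkthrough, file paths are invented.
plan = {
    "connection manager": {"backend/manager.py"},
    "websocket handler": {"backend/manager.py", "backend/routes.py"},
    "frontend hook": {"frontend/useLogs.ts"},
}
```

Here `overlapping_tasks(plan)` reports that the connection manager and WebSocket handler tasks both touch `backend/manager.py`, which is a hint to merge them or redraw the boundary before running /build.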

Also, this only makes sense if you already use AI coding agents regularly. If you’re not on Claude Code, Cursor, Gemini CLI, or OpenCode, agent-skills has nothing to plug into. And if you’re a solo dev who prefers to code directly without agent assistance, this whole layer of indirection might feel like overhead.

One more thing: the project moves fast. With 821 stars per day and the last commit 4 hours ago, expect breaking changes as the skill packs evolve. Still, the fundamentals (the 7 commands) are stable, and the exact behavior should only sharpen over time.

Bottom Line