<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Umadev Review on ToolGenix — Open-Source AI &amp; Developer Tools: Honest Hands-On Reviews</title><link>https://toolgenix.nxtniche.com/tags/umadev-review/</link><description>Recent content in Umadev Review on ToolGenix — Open-Source AI &amp; Developer Tools: Honest Hands-On Reviews</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Thu, 25 Jun 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://toolgenix.nxtniche.com/tags/umadev-review/index.xml" rel="self" type="application/rss+xml"/><item><title>umadev Review: I Tested This Open-Source AI Project Director</title><link>https://toolgenix.nxtniche.com/posts/umadev-review-open-source-ai-project-director/</link><pubDate>Thu, 25 Jun 2026 00:00:00 +0000</pubDate><guid>https://toolgenix.nxtniche.com/posts/umadev-review-open-source-ai-project-director/</guid><description>I tested umadev — an open-source AI project director orchestrating Claude Code with a 9-role team, quality gates, and delivery proofs. Honest findings inside.</description><content:encoded><![CDATA[<p><em>Disclosure: I may earn a commission if you sign up through links on this page. This review is based on my own testing — no sponsor influence.</em></p>
<hr>
<p>You know the drill. You tell Claude Code to &ldquo;build a todo app with Postgres,&rdquo; it cranks out code in 90 seconds, and you think you&rsquo;re done. Then you look at what it actually built — mismatched API paths, hardcoded colors, placeholder images, and TODOs scattered through every file. It says &ldquo;done&rdquo; but it&rsquo;s not done.</p>
<p>I&rsquo;ve been there more times than I can count. So when I found <strong>umadev</strong>, an open-source Rust project that claims to turn AI coding agents into a structured delivery team, I had to try it. 122 stars in its first week on GitHub. That&rsquo;s not a lot by absolute numbers, but for a tool this niche? It tells me developers feel the same pain.</p>
<p><strong>Here&rsquo;s the short version:</strong> umadev doesn&rsquo;t replace your AI coding CLI. It wraps around it — Claude Code, Codex, or OpenCode — and runs the show like a real project director. Think of it as the producer who yells &ldquo;cut&rdquo; when the scene isn&rsquo;t right, not the actor on stage.</p>
<h2 id="what-is-umadev">What Is umadev?</h2>
<p>umadev is a single Rust binary (MIT license, v1.0.7) that you install via npm. Yes, npm for a Rust binary: it&rsquo;s a distribution shim, not a Node app. Under the hood it&rsquo;s pure Rust, cross-compiled for macOS, Linux, and Windows.</p>
<p>The core idea is dead simple: you describe what you want in plain language, and umadev orchestrates your base coding agent through a structured delivery pipeline. <strong>The base does the coding. umadev does the directing.</strong></p>
<p>It runs a 9-role review team that sounds excessive on paper but makes sense in practice:</p>
<table>
	<thead>
			<tr>
					<th style="text-align: left">Role</th>
					<th style="text-align: left">What They Do</th>
			</tr>
	</thead>
	<tbody>
			<tr>
					<td style="text-align: left"><strong>Director</strong></td>
					<td style="text-align: left">Owns the plan, drives the main session, aggregates all verdicts</td>
			</tr>
			<tr>
					<td style="text-align: left"><strong>Product Manager</strong></td>
					<td style="text-align: left">Checks scope, acceptance criteria, PRD completeness</td>
			</tr>
			<tr>
					<td style="text-align: left"><strong>Architect</strong></td>
					<td style="text-align: left">Reviews data model, APIs, scalability decisions</td>
			</tr>
			<tr>
					<td style="text-align: left"><strong>UI/UX Designer</strong></td>
					<td style="text-align: left">Enforces design tokens, typography, component states</td>
			</tr>
			<tr>
					<td style="text-align: left"><strong>Frontend Engineer</strong></td>
					<td style="text-align: left">Writes UI code and handles frontend reviews</td>
			</tr>
			<tr>
					<td style="text-align: left"><strong>Backend Engineer</strong></td>
					<td style="text-align: left">Writes server code and handles backend reviews</td>
			</tr>
			<tr>
					<td style="text-align: left"><strong>QA Engineer</strong></td>
					<td style="text-align: left">Checks test coverage, edge cases, runtime behavior</td>
			</tr>
			<tr>
					<td style="text-align: left"><strong>Security</strong></td>
					<td style="text-align: left">Runs SAST, secret scanning, auth pattern review</td>
			</tr>
			<tr>
					<td style="text-align: left"><strong>DevOps</strong></td>
					<td style="text-align: left">Validates Dockerfile, CI config, deployment setup</td>
			</tr>
	</tbody>
</table>
<p>The key design choice here is that these roles <strong>never chat with each other</strong>. They coordinate exclusively through shared artifact files and structured verdicts. No infinite agent-to-agent conversation loops. That alone makes me trust the architecture more than most &ldquo;multi-agent&rdquo; systems I&rsquo;ve tested.</p>
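<p>To make the artifact-and-verdict idea concrete, here is a minimal Python sketch of the pattern. It is my own illustration, not umadev&rsquo;s code: the shared directory, the JSON fields (<code>role</code>, <code>status</code>, <code>findings</code>), and both function names are hypothetical. Each role writes one file, and only the director reads them.</p>
<pre tabindex="0"><code class="language-python" data-lang="python">import json
import tempfile
from pathlib import Path

# Hypothetical layout: each role drops one JSON verdict file into a shared
# directory. These field names are assumptions, not umadev's real schema.

def write_verdict(directory, role, status, findings):
    """A role records its verdict as a file instead of messaging other roles."""
    path = Path(directory) / f"{role}.json"
    path.write_text(json.dumps({"role": role, "status": status, "findings": findings}))
    return path

def aggregate(directory):
    """The director reads every verdict file and blocks if any role blocks."""
    verdicts = [json.loads(p.read_text()) for p in sorted(Path(directory).glob("*.json"))]
    blocking = [v for v in verdicts if v["status"] == "block"]
    return {
        "approved": len(blocking) == 0,
        "blocking_roles": [v["role"] for v in blocking],
        "reviewed_by": [v["role"] for v in verdicts],
    }

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as shared:
        write_verdict(shared, "architect", "pass", [])
        write_verdict(shared, "security", "block", ["API key committed in config"])
        print(aggregate(shared))
</code></pre>
<p>Because every role only reads and writes files, there is no conversation that can loop: a review round ends as soon as each role has written its verdict.</p>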
<h2 id="quick-start-heres-what-happened-when-i-ran-it">Quick Start: Here&rsquo;s What Happened When I Ran It</h2>
<p>I installed umadev on my Ryzen 9 workstation running Ubuntu 24.04, with Claude Code already logged in:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>npm install -g umadev
</span></span></code></pre></div><p>The npm install finished in about 12 seconds. Behind the scenes it pulled the Rust binary and a ~224 MB local embedding model (<code>multilingual-e5-small</code>) for offline vector search. No Docker, no Python deps, no API keys to configure.</p>
<p>Then I ran:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-bash" data-lang="bash"><span style="display:flex;"><span>umadev
</span></span></code></pre></div><p>First launch shows a clean TUI — markdown rendering, syntax-highlighted code, the works. It asked me to pick a backend from three options: Claude Code, Codex, or OpenCode. I picked Claude Code. That was it. No config files to edit.</p>
<p>I gave it a test prompt:</p>
<pre tabindex="0"><code>build a todo app with a Postgres backend
</code></pre><p><strong>Here&rsquo;s what surprised me:</strong> It didn&rsquo;t start coding immediately. Instead it showed an intent card (&ldquo;full build, entering the delivery flow&rdquo;), then spent roughly 40 seconds planning. It drafted three documents before a line of code was written: a PRD, an architecture doc, and a UI/UX spec. Each showed up in <code>output/</code> as a markdown file. Then it paused for my review at the <code>docs_confirm</code> gate.</p>
<p>That pause is important. Most AI coding tools rush straight to implementation. umadev forces a human checkpoint <em>before</em> code, which is exactly what I&rsquo;d do with a junior developer on my team.</p>
<p>After I approved the docs, it built an execution plan with dependency ordering, wrote the frontend (React + TypeScript), paused again for a live preview, then wrote the backend, ran a full quality gate, and handed me a delivery pack with a scorecard and proof archive.</p>
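<p>For readers unfamiliar with dependency ordering, here is a small Python sketch of the idea using the standard library&rsquo;s <code>graphlib</code>. The task names and the dependency graph are my own assumptions that loosely mirror the run above. They are not read from umadev&rsquo;s actual plan format.</p>
<pre tabindex="0"><code class="language-python" data-lang="python">from graphlib import TopologicalSorter

# Hypothetical task graph: each key depends on the tasks in its set.
# The task names are mine, loosely mirroring the run described above.
PLAN = {
    "docs": set(),
    "frontend": {"docs"},
    "preview": {"frontend"},
    "backend": {"docs"},
    "quality_gate": {"preview", "backend"},
    "delivery_pack": {"quality_gate"},
}

def execution_order(plan):
    """Return one valid order in which every task follows its dependencies."""
    return list(TopologicalSorter(plan).static_order())

if __name__ == "__main__":
    for step, task in enumerate(execution_order(PLAN), start=1):
        print(step, task)
</code></pre>
<p>If the graph contains a cycle, <code>static_order()</code> raises <code>graphlib.CycleError</code>, which is the behavior you want from a planner: an impossible plan fails loudly before any code is written.</p>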
<p>The whole thing took about 14 minutes for a basic todo app. That&rsquo;s not fast compared to raw Claude Code output, but the output was <strong>actually shippable</strong>: no fake data, no mismatched routes, no random emoji icons.</p>
<h2 id="what-makes-it-different-from-raw-claude-code">What Makes It Different From Raw Claude Code</h2>
<p>Most people use Claude Code like a smart autocomplete — type a prompt, get code, fix issues one by one until it works. umadev enforces a completely different workflow:</p>
<table>
	<thead>
			<tr>
					<th style="text-align: left">Dimension</th>
					<th style="text-align: center">Raw Claude Code</th>
					<th style="text-align: center">umadev + Claude Code</th>
			</tr>
	</thead>
	<tbody>
			<tr>
					<td style="text-align: left"><strong>Planning</strong></td>
					<td style="text-align: center">None — starts coding immediately</td>
					<td style="text-align: center">PRD, architecture, UI/UX docs first</td>
			</tr>
			<tr>
					<td style="text-align: left"><strong>Quality</strong></td>
					<td style="text-align: center">Self-assessment (&ldquo;looks good to me&rdquo;)</td>
					<td style="text-align: center">Deterministic gate: build, test, lint, contract check</td>
			</tr>
			<tr>
					<td style="text-align: left"><strong>Governance</strong></td>
					<td style="text-align: center">None</td>
					<td style="text-align: center">~112 rules: no emoji icons, no leaked secrets, no AI-slop patterns</td>
			</tr>
			<tr>
					<td style="text-align: left"><strong>Frontend↔Backend</strong></td>
					<td style="text-align: center">Manual validation</td>
					<td style="text-align: center">Mechanical contract check via <code>umadev-contract</code></td>
			</tr>
			<tr>
					<td style="text-align: left"><strong>Delivery</strong></td>
					<td style="text-align: center">&ldquo;Done&rdquo; (they say)</td>
					<td style="text-align: center">Scorecard, proof pack, compliance mapping</td>
			</tr>
			<tr>
					<td style="text-align: left"><strong>Learning</strong></td>
					<td style="text-align: center">Stateless each session</td>
					<td style="text-align: center">Self-evolving memory — records mistakes, reflects on recurrence</td>
			</tr>
	</tbody>
</table>
<p>But here&rsquo;s the thing — and I want to be honest about this — umadev is slower for simple tasks. If you need a quick 20-line bash script or a one-file utility, raw Claude Code is faster. The overhead of planning, reviewing, and gating only pays off when you&rsquo;re building something with actual structure.</p>
<p>It&rsquo;s also worth comparing umadev to other tools I&rsquo;ve covered on ToolGenix:</p>
<ul>
<li><strong><a href="https://github.com/zhayujie/CowAgent">CowAgent</a></strong> is a 24/7 AI butler — great for always-on chat and automation, but it&rsquo;s not a software delivery engine. CowAgent answers questions and runs tools. umadev ships projects.</li>
<li><strong><a href="/posts/metaharness-agent-harness-generator-quick-look/">MetaHarness</a></strong> generates agent scaffolds — the &ldquo;create-react-app for agent frameworks&rdquo; — but it stops at the skeleton. umadev takes you from skeleton to shipped product with governance and proof.</li>
</ul>
<h2 id="the-umadev-quality-gate-this-is-where-it-shines">The umadev Quality Gate: This Is Where It Shines</h2>
<p>The quality gate is umadev&rsquo;s killer feature, and here&rsquo;s why I think so.</p>
<p>After the base finishes writing code, umadev runs an independent check: build, test, lint, typecheck, contract validation, runtime probe. It doesn&rsquo;t ask the model &ldquo;is this good?&rdquo; — it checks the actual artifacts. The runtime probe starts the app and hits its routes, writing <code>runtime-proof.json</code> as evidence that the app actually responds.</p>
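<p>A runtime probe is easy to picture in code. The Python sketch below is an assumption-heavy illustration, not umadev&rsquo;s implementation: <code>StubApp</code> stands in for the generated app, and the result fields (<code>route</code>, <code>status</code>, <code>ok</code>) are invented here. It prints the evidence as JSON rather than guessing at the real <code>runtime-proof.json</code> schema.</p>
<pre tabindex="0"><code class="language-python" data-lang="python">import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class StubApp(BaseHTTPRequestHandler):
    """Stand-in for the generated app: serves one route, 404s the rest."""

    def do_GET(self):
        known = self.path == "/api/v1/todos"
        body = b"[]" if known else b""
        self.send_response(200 if known else 404)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the probe output quiet

def probe(base_url, routes):
    """Hit each route and record the HTTP status actually returned."""
    # An empty ProxyHandler makes sure localhost requests skip any proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    results = []
    for route in routes:
        try:
            with opener.open(base_url + route, timeout=5) as resp:
                status = resp.status
        except urllib.error.HTTPError as err:
            status = err.code
        results.append({"route": route, "status": status, "ok": status == 200})
    return results

def run_probe(routes):
    """Start the stub on a free port, probe it, then shut it down."""
    server = HTTPServer(("127.0.0.1", 0), StubApp)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return probe(f"http://127.0.0.1:{server.server_port}", routes)
    finally:
        server.shutdown()
        server.server_close()

if __name__ == "__main__":
    print(json.dumps(run_probe(["/api/v1/todos", "/api/v1/items"]), indent=2))
</code></pre>
<p>The point of the design is that the evidence comes from real HTTP responses, so a model cannot talk its way past a route that returns 404.</p>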
<p>In my test, it caught one thing I wouldn&rsquo;t have noticed: the frontend was calling <code>/api/v1/todos</code> but the backend route was defined as <code>/api/v1/items</code>. umadev&rsquo;s contract checker found the mismatch and flagged it as a blocking finding. That&rsquo;s exactly the kind of bug that costs me 20 minutes of debugging in a normal workflow.</p>
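<p>Mechanically, a check like this boils down to a set difference. Here is a minimal Python sketch under my own assumptions: the two source strings, the regexes, and the function names are hypothetical, and I have not looked at how <code>umadev-contract</code> actually extracts routes.</p>
<pre tabindex="0"><code class="language-python" data-lang="python">import re

# Hypothetical inputs: in a real project these would be extracted from the
# frontend source and the backend router. The regexes are illustrative only.
FRONTEND_SRC = 'fetch("/api/v1/todos"); fetch("/api/v1/health");'
BACKEND_SRC = 'router.get("/api/v1/items", list_items); router.get("/api/v1/health", health);'

def called_paths(frontend_src):
    """Collect every path passed to fetch(...) in the frontend source."""
    return set(re.findall(r'fetch\("([^"]+)"\)', frontend_src))

def served_paths(backend_src):
    """Collect every path registered on the backend router."""
    return set(re.findall(r'router\.\w+\("([^"]+)"', backend_src))

def contract_findings(frontend_src, backend_src):
    """Paths the frontend calls that no backend route serves are blocking."""
    missing = called_paths(frontend_src) - served_paths(backend_src)
    return sorted(missing)

if __name__ == "__main__":
    print(contract_findings(FRONTEND_SRC, BACKEND_SRC))
</code></pre>
<p>Run as-is, this prints <code>['/api/v1/todos']</code>, the same mismatch I hit. A production checker would parse the code properly instead of using regexes, but the comparison step is the same.</p>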
<p>Governance checks run on every file write: ~112 rules covering UI quality (no emoji-as-icons), security (no leaked API keys), architecture (no hardcoded colors), and language-specific patterns. Every rule is configurable in <code>.umadev/rules.toml</code>. And importantly, they&rsquo;re fail-open: a bug in the governor never blocks your work.</p>
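<p>Fail-open is worth a quick illustration. In the Python sketch below, the three rules, their regexes, and the <code>govern</code> function are all hypothetical stand-ins I wrote for this review. They are not umadev&rsquo;s rule engine and do not reflect the <code>rules.toml</code> format. The idea is that a rule that crashes gets skipped instead of stopping the write.</p>
<pre tabindex="0"><code class="language-python" data-lang="python">import re

# Hypothetical rules: each takes file text and returns a list of findings.
def no_hardcoded_colors(text):
    return [f"hardcoded color {c}" for c in re.findall(r"#[0-9a-fA-F]{6}\b", text)]

def no_leaked_keys(text):
    return ["possible API key"] if re.search(r"sk-[A-Za-z0-9]{20,}", text) else []

def buggy_rule(text):
    raise RuntimeError("rule crashed")  # simulates a bug inside the governor

def govern(text, rules):
    """Run every rule; a rule that crashes is skipped (fail-open), not fatal."""
    findings, skipped = [], []
    for rule in rules:
        try:
            findings.extend(rule(text))
        except Exception:
            skipped.append(rule.__name__)
    return findings, skipped

if __name__ == "__main__":
    source = 'const accent = "#ff00aa";'
    print(govern(source, [no_hardcoded_colors, buggy_rule, no_leaked_keys]))
</code></pre>
<p>This prints <code>(['hardcoded color #ff00aa'], ['buggy_rule'])</code>: the real finding still surfaces, and the broken rule is recorded as skipped. The trade-off is that a crashed rule silently checks nothing, so tracking the skipped list matters.</p>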
<h2 id="what-could-be-better-in-umadev">What Could Be Better in umadev</h2>
<p>I&rsquo;m not going to pretend this is perfect. Here are the rough edges I hit:</p>
<ol>
<li>
<p><strong>Cold start is slow.</strong> The first response takes 30–60 seconds because it pre-loads the firmware (system prompt, knowledge base, repo map) into the base. After that, subsequent turns are fast, but that initial wait is noticeable.</p>
</li>
<li>
<p><strong>Heavy for small tasks.</strong> A one-line bug fix doesn&rsquo;t need a PRD and a 9-role review. umadev does scale down — the router handles this — but in practice I found the planning overhead still adds friction for trivial changes.</p>
</li>
<li>
<p><strong>The embedding model is a ~224 MB download.</strong> It&rsquo;s optional, and search degrades gracefully to BM25-only without it, but on a metered connection that&rsquo;s a big chunk of data for a &ldquo;quick install.&rdquo;</p>
</li>
<li>
<p><strong>Still early.</strong> 122 stars, first release June 19, 2026. The spec is solid and the architecture is Rust-grounded, but the ecosystem (community plugins, integrations, docs) is minimal. You&rsquo;re an early adopter if you jump on this now.</p>
</li>
</ol>
<h2 id="who-should-use-this">Who Should Use This</h2>
<table>
	<thead>
			<tr>
					<th style="text-align: left">If you&hellip;</th>
					<th style="text-align: left">umadev is worth a try</th>
			</tr>
	</thead>
	<tbody>
			<tr>
					<td style="text-align: left">Build full-stack apps with AI coding agents</td>
					<td style="text-align: left">✅ Especially if you&rsquo;ve hit the &ldquo;it says done but it&rsquo;s not&rdquo; wall</td>
			</tr>
			<tr>
					<td style="text-align: left">Manage a small team evaluating AI-assisted delivery</td>
					<td style="text-align: left">✅ The scorecard and proof pack make handoffs auditable</td>
			</tr>
			<tr>
					<td style="text-align: left">Want governance over AI-generated code</td>
					<td style="text-align: left">✅ ~112 rules you can configure per-project</td>
			</tr>
			<tr>
					<td style="text-align: left">Need a quick script or one-file utility</td>
					<td style="text-align: left">❌ Overkill — stick with raw Claude Code or Codex</td>
			</tr>
			<tr>
					<td style="text-align: left">Deploy agent-driven workflows in production</td>
					<td style="text-align: left">✅ This is exactly the use case</td>
			</tr>
	</tbody>
</table>
<h2 id="the-bottom-line">The Bottom Line</h2>
<p>umadev doesn&rsquo;t try to replace how you write code with AI. Instead it solves a different problem — one that becomes painfully obvious the moment you try to ship something real: AI coding agents are great at writing code and terrible at managing a project.</p>
<p>It costs nothing (MIT license, open source). It works with the tools you already use (Claude Code, Codex, OpenCode). And while it&rsquo;s early — 122 stars, less than a week old at the time of writing — the engineering is grounded in a real spec, a real Rust codebase, and real delivery artifacts.</p>
<p>I&rsquo;d like to see what this project looks like in six months. But right now, if you&rsquo;re building anything beyond a one-file prototype with AI agents, umadev is worth your Sunday afternoon.</p>
<p><strong>Want to run umadev as a 24/7 team service?</strong> <em>(affiliate links)</em> You&rsquo;ll need a VPS. A $6/month <a href="/go/do">DigitalOcean Droplet</a> is plenty for the Rust binary plus a small Postgres instance. Or if you prefer multi-region coverage, <a href="/go/vultr">Vultr</a> starts at $2.50/month. Either way, the project director runs all day without your laptop.</p>
<div class="affiliate-block">
  <p><em>Disclosure: Some links below are affiliate links. If you sign up through them, I may earn a commission at no extra cost to you.</em></p>
  <ul>
    <li><a href="https://toolgenix.nxtniche.com/go/vultr" rel="nofollow sponsored" target="_blank">Vultr</a> — starts at $2.50/mo</li>
    <li><a href="https://toolgenix.nxtniche.com/go/do" rel="nofollow sponsored" target="_blank">DigitalOcean</a> — $200 credit for new users</li>
  </ul>
</div>
]]></content:encoded></item></channel></rss>