I think we've been measuring engineering teams with the wrong unit.

We count heads. Headcount is the number that goes in the board deck, the number that justifies the budget, the number a VP defends in planning. But building KarmaClock alone, a production app now live on both stores, convinced me the number that actually matters isn't how many engineers you have. It's how much leverage each one carries. And once AI-native development becomes the default skillset rather than a novelty, that leverage moves far enough that the old headcount math stops describing reality.

I want to be careful with this, because it's the kind of claim that gets flattened into "AI means fewer engineers" the moment it leaves my hands. That's not what I mean, and the distinction is the whole point. So let me show you what I actually saw.

Three hats, one person, one afternoon

KarmaClock is at v1.0.9, with light/dark theming shipped last weekend and 255 tests across a real pyramid: unit, component, integration, end-to-end. I built it solo, evenings and weekends. The instinct is to credit AI with raw speed: it writes code fast, so one person moves like several.

That's not where the leverage came from. It came from wearing three hats in a row, with no one to hand off to between them.

Start with the PM hat. The app shipped dark-only, and I wanted to give users a choice. So I asked: what would it take to offer a set of custom themes people could pick from? I got back a real level-of-effort breakdown, work itemized, time-to-market estimated. Then I did the thing a good PM does to their own idea: I challenged it. Is optionality even worth it here? The pushback was immediate and correct. Most mobile apps don't ship theme builders; they follow the native OS light or dark setting, because that's the convention users already expect. That reframed the scope on the spot. Not a custom theme engine. Two palettes that respect the phone's setting. The decision was mine, but I made it with counsel in the room.
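To make the scoped-down decision concrete, here is a minimal sketch of "two palettes that respect the phone's setting." It is not KarmaClock's actual code: the `Palette` type, the `palette_for` function, and every color value are hypothetical, and the fallback to dark when the OS reports nothing is my assumption, based only on the app having shipped dark-only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Palette:
    background: str
    text: str
    accent: str


# Hypothetical colors for illustration only; not the app's real palette.
DARK = Palette(background="#121212", text="#F5F5F5", accent="#7C9CFF")
LIGHT = Palette(background="#FFFFFF", text="#1A1A1A", accent="#3355CC")


def palette_for(system_scheme: Optional[str]) -> Palette:
    """Pick one of two fixed palettes from the OS-reported color scheme.

    Assumption: anything other than "light" (including None, when the
    OS reports no preference) falls back to dark, the original default.
    """
    if system_scheme == "light":
        return LIGHT
    return DARK
```

The point of the sketch is how little is left once the theme builder is out of scope: one branch, no user-facing setting, nothing to persist.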

Then the UX hat. With dark as the existing default, I needed to see light. I asked for prototypes, and in minutes I had high-fidelity HTML mockups of every screen. The quality tracked the clarity of the ask. Because I said I wanted a prototype to review and decide on, not just a picture, the work came back as side-by-side views of current versus proposed, plus a legend of the exact color palette in use, so I had a reference while I reviewed. None of that was magic. It was a clear request from someone who knew what "done" looked like. A few rounds of color tweaks, which is where that palette legend earned its keep, with a regeneration after each, and the design was locked. Six screens. Under twenty-five minutes.

Sit with that number against the conventional version. Six screens of light-mode design, ironed out properly, is realistically three to five calendar days for a small team: a UX designer for most of a workday, a PM and an engineer giving a few hours each to review and sanity-check feasibility, and that's before the feedback loops, which is where it actually balloons. Every "can we shift that accent a step darker" is another async round of revise, re-review, re-check. Call it forty-odd person-hours spread across most of a week. That is the conventional creative mixer, and it is not efficient. When the mockups locked here, the requirements were done, because the design was the requirement.

I'm describing these as three clean hats, but that's tidier than it felt. The PM and UX work blurred into each other constantly. "Should we even offer a toggle?" is a product question and a design question in the same breath, and when one person holds both seats, the boundary between them just dissolves. That blur isn't sloppiness. It's the point. The seams between roles are where coordination cost usually hides.

Then the engineer hat. I reviewed the locked design, asked for a plan to update the specs to match, reviewed the plan, locked it, and then asked for any clarifying questions before we touched code, so the ambiguity got surfaced up front instead of discovered halfway through. Execution ran in parallel. The coding was the short part.

Start to finish, from problem framing through committed code, it took about two and a half hours, one person. Then I triggered the Android and iOS builds and waited twenty-odd minutes for the machines to compile two native binaries. I want to be precise about what was slow and why, because the easy version of this story is wrong. The build wasn't the long pole. The build is a fixed cost I don't control and shouldn't try to: it takes what it takes, it doesn't need me, and if I'd made a mistake I'd just run it again. The actual long pole was the human part, the ideation and the three-hat switch. That was the most demanding stretch of the afternoon.

But here is the whole point: that long pole, the slow, valuable, irreducible creative work, still beat the conventional version of the same work by an order of magnitude. Two and a half hours against the better part of a week. One person against three. I didn't eliminate the creative loop.

I compressed it.
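For anyone who wants to check the arithmetic, here is a rough sketch using only the figures quoted above. The constant names are mine, "forty-odd" is rounded down to 40, and "under twenty-five minutes" is taken at its upper bound. One caveat: the forty-odd person-hours was an estimate for the design work alone, so setting it against the full two and a half hours likely understates the gap.

```python
# Figures quoted in the text; the names and rounding are assumptions.
CONVENTIONAL_PERSON_HOURS = 40   # "forty-odd person-hours", rounded down
SOLO_HOURS = 2.5                 # problem framing through committed code
DESIGN_MINUTES = 25              # "under twenty-five minutes", upper bound


def compression(conventional_hours: float, actual_hours: float) -> float:
    """Return how many times smaller the actual effort was."""
    return conventional_hours / actual_hours


# Whole afternoon against the conventional design-phase estimate.
end_to_end = compression(CONVENTIONAL_PERSON_HOURS, SOLO_HOURS)

# Design phase only, the like-for-like comparison.
design_only = compression(CONVENTIONAL_PERSON_HOURS, DESIGN_MINUTES / 60)

print(f"end to end: about {end_to_end:.0f}x")
print(f"design only: about {design_only:.0f}x")
```

Even the conservative ratio clears a factor of ten, which is all "an order of magnitude" is claiming. These are person-hours, not calendar time; the three-to-five-day calendar figure is a separate comparison.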

We are very good at imagining AI as the thing that types code fast. The bigger shift is that it collapses the expensive human loop around the typing, the part that conventionally sprawls across a week of calendars, down to an afternoon.

Where this lands

Notice what didn't collapse: the work. PM, UX, and engineering all still happened, with real rigor at each step. What shrank toward zero was everything between them. No ticket sitting in a queue waiting for the designer. No spec lost in translation from product to engineering. No meeting to align three people on what "done" means, because the three were one person moving from seat to seat with the context intact the whole way. That gap between roles, the coordination tax that quietly eats most of a team's calendar, is the thing AI-native development actually attacks. Not the roles. The seams.

The future everyone keeps predicting is already usable today: less detailed sprint planning, more collaborative back-and-forth on a tight loop. You still need all the judgment, in every role. You just stop paying the tax to move between them.

So when I say a high-performing, AI-native team is a higher-leverage team, and that in that context less is more, I don't mean people are waste. I mean the opposite. The unit that matters stops being headcount and becomes talent density: fewer people, each carrying more of the loop, each more valuable, not less. I mean small pods whose feedback cycle runs in minutes where the old process ran in sprints. If you're a startup, that's the difference between staying lean long enough to learn whether the market cares and hiring a department before you've earned the right to. If you're a mature product company, it's harder and more interesting: doing more with less isn't a headcount cut, it's redesigning the whole assembly line so the loop is actually that fast. The cliché finally has teeth.

I don't think this generalizes cleanly to every team or every codebase, and I suspect it breaks in ways I haven't hit yet, especially on systems I didn't build from scratch. That's the harder question, and it's where I'm headed next.

For now I'll just say the thing I keep coming back to: I've stopped asking how many engineers a problem needs. I've started asking how much leverage the right ones would carry. Those turn out to be very different questions.

What's the work on your team that only exists because of coordination, and what happens to it when the coordination cost goes to zero?
