Pull the Lever

What happens when one AI-augmented engineer replaces nine

Mar 28, 2026

An AI-augmented engineer can produce the output of ten. Neither the engineer nor the employer has a compensation or risk model for that — and the side that figures it out first wins.

Here is a receipt. I built a pipeline that writes code from specifications — I describe what the software should do, the pipeline generates it, tests it, and commits it. The test project is a reimplementation of standard Unix utilities in Go — a finite scope with well-defined requirements. Over the past six weeks, on my own time, that pipeline produced 320,000 lines of code across 46 experiments. The first run produced 159 lines. The most recent produced 73,805. Same AI model, same token budget. The difference was the pipeline getting smarter, not the AI. Give this toolchain to someone with a bigger problem and the numbers scale with the ambition.

The Risk Asymmetry

Two things are happening at the same time. Some organizations are embracing AI tooling — investing in infrastructure, training engineers, building pipelines. Others are banning the tools, restricting access, and waiting to see what happens. Twenty-seven percent of organizations have banned generative AI tools entirely [1]. Among development teams at those companies, 99% of developers use the tools anyway — they just do it outside the firewall [2].

The engineers at the slow organizations are not uninformed. They can see what the tools do. They understand the cost of not learning them. They watch companies that have embraced the tools ship faster, iterate more, and compound capability while their own employer debates procurement policy. So they learn on their own time, build on their own machines, and develop skills their employer will not invest in, even as that employer still benefits from the output those skills produce.

The engineer’s risk has three parts.

The downside is the engineer’s. The company may fail. A restructuring may eliminate your role. If management misreads the market or botches the execution, your position disappears regardless of how much code you shipped. You bear the consequences of decisions you did not make.

The upside is the employer’s. Your salary is fixed. No bonus tied to the volume you produced or the velocity you enabled. If the company ships faster because one engineer is producing the output of ten, that engineer receives the same paycheck they received when they were producing the output of one.

The employer who embraces the tools captures the output directly. The employer who bans them does not — but still absorbs the architectural thinking that the engineer developed on personal time using the banned tools. Neither employer has adjusted compensation.

The difference is that the engineer at the organization that banned the tools can see what the other side looks like. They know what the tools can do. They know their employer chose not to learn. And they know that the organizations that did embrace the tools are hiring. An engineer with AI-augmented capability is not going to stay indefinitely at an organization that pretends the capability does not exist, or that moves too slowly to adopt best-of-breed tools before the competition does.

The Employer’s Arithmetic

An engineer who can generate, evaluate, and refine 50,000 lines of working code inside of 24 hours operates at a different pace than an organization that takes six months to approve a cloud vendor. This is not a judgment about organizational competence. Large organizations move slowly for reasons — compliance, security, coordination costs, risk management. Those reasons are real. They are also irrelevant to the arithmetic.

Based on what I have observed building my own pipeline, it is now possible to reliably produce 73,000 lines of working code overnight. At that rate (one run per day, single-threaded, on a single subscription) the pipeline produces a million lines every two weeks. More money for tokens speeds this up. But the constraint is not the tool or even the token budget. It is the imagination of the person holding it.
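The million-lines figure is simple division, and a short sketch makes it easy to re-run with other numbers. The function name and its parameters are illustrative assumptions; only the 73,000 lines per run and the one-run-per-day rate come from the text.

```python
import math


def days_to_reach(target_lines: int, lines_per_run: int, runs_per_day: int = 1) -> int:
    """Whole days needed to reach target_lines at a steady generation rate."""
    if lines_per_run <= 0 or runs_per_day <= 0:
        raise ValueError("rates must be positive")
    return math.ceil(target_lines / (lines_per_run * runs_per_day))


# Figures from the text: about 73,000 lines per overnight run, one run per day.
print(days_to_reach(1_000_000, 73_000))  # 14
```

Doubling `runs_per_day` halves the time, which is the "more money for tokens" case. Nothing in the function models the quality of the specifications feeding the pipeline.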

Now run the employer’s arithmetic. You have a team of ten engineers. One of them develops AI-augmented capability that produces the output of the other nine. The fully loaded cost of AI tooling (API tokens, subscriptions, compute) runs $5,000 to $10,000 per month. Add that to the engineer’s salary and you are paying the equivalent of two engineers. For the cost of two, you get the output of ten. You could “pull the lever,” let nine go, and save 80%. Someone told me recently that this was their strategy to make their company profitable.
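The spreadsheet version of that decision fits in a few lines. This is a minimal sketch that measures cost in units of one fully loaded engineer. The function name is hypothetical, and the default tooling cost of one engineer-equivalent is an assumption taken from the claim that tooling plus salary costs about as much as two engineers.

```python
def lever_savings(team_size: int, tooling_cost_in_salaries: float = 1.0) -> float:
    """Fraction of payroll saved by keeping one engineer plus AI tooling.

    All costs are expressed in units of one fully loaded engineer.
    The default tooling cost (1.0) is an assumption, not a measured figure.
    """
    if team_size < 1:
        raise ValueError("team_size must be at least 1")
    cost_before = float(team_size)
    cost_after = 1.0 + tooling_cost_in_salaries
    return (cost_before - cost_after) / cost_before


# Ten engineers replaced by one engineer plus tooling that costs as much as one more.
print(f"{lever_savings(10):.0%}")  # 80%
```

The calculation only has a cost column. It has no term for what happens to output when the remaining engineer is unavailable.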

However, that lever is short-sighted and has a cost the spreadsheet does not capture. That one engineer is now a single point of failure for the output of ten. If they get hit by a bus, burn out, or leave — and they will be recruited aggressively, because everyone wants the engineer who can produce a team’s volume — the organization loses the output of ten people overnight. If you already “pulled the lever”, the other nine engineers are already gone. And nobody remaining knows what is in the codebase — because the code was never reviewed by nine other pairs of eyes, the architectural decisions live in one person’s head, and the specifications that drive the code generation are documentation that nobody else has read.

The obvious counterargument is that the AI can read the code. When the engineer leaves, the replacement can use the same tools to understand the architecture, suggest changes, make modifications. The codebase is not locked in someone’s head — it is in the repository, and AI is good at reading repositories.

Except that AI is sometimes wrong. I started writing this article because the AI miscounted its own output by roughly 6x. Is that the tool you want a new hire, someone with zero context on the system, relying on to make changes to a codebase nobody else has ever reviewed? Institutional knowledge is hard to build and easy to lose. AI does not replace it. AI accelerates the person who has it. Without that person, AI accelerates the mistakes.

An organization that “pulls the lever” and replaces nine engineers with one AI-augmented engineer and a token budget has not simply reduced headcount. It has concentrated its risk. Without intending to, the organization has made that one engineer’s value appreciate. Its ability to absorb that value, or to survive its departure, has not kept pace.
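One way to put a number on "concentrated risk" is the share of total output that disappears if the single most productive person leaves. The sketch below is illustrative only: the function name and the equal-output assumption for the original ten engineers are mine, not measurements.

```python
def output_lost_if_one_leaves(contributions: list[float]) -> float:
    """Worst-case share of total output lost when a single person departs."""
    total = sum(contributions)
    if total <= 0:
        raise ValueError("total output must be positive")
    return max(contributions) / total


before = [1.0] * 10  # ten engineers, assumed to produce one unit each
after = [10.0]       # one AI-augmented engineer producing all ten units

print(output_lost_if_one_leaves(before))  # 0.1
print(output_lost_if_one_leaves(after))   # 1.0
```

Total output is the same in both cases. The worst-case loss from one resignation goes from a tenth of the output to all of it.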

Every organization used to have the one person who knew how to maintain the Excel macros. Nobody understood what the macros did. Nobody could replace that person. Management resented the dependency but tolerated it because the spreadsheets had to work.

Now replace the Excel macros with an entire codebase. The AI-augmented engineer is that person, except the dependency is ten times deeper. The macros were a few thousand lines of VBA. The codebase is hundreds of thousands of lines that the engineer produced using tools and techniques the organization does not understand [3] [4]. The organization cannot evaluate the output. The organization cannot replicate the capability. The balance of power has shifted to the engineer and the employer has not noticed — because the org chart still shows the same reporting line and the same salary band.

When the AI Miscounted Its Own Output

I asked the AI to count how many lines of code it had generated. It told me 1.87 million. I believed it. I put that number in a published article. I felt like a superstar. One fairly obvious lesson: do not write Substack articles while on a beach vacation.

Nevertheless, my gut said the number was wrong. Not a calculation — a feeling. Twenty years of shipping production systems builds an intuition for when numbers do not pass the smell test. So I asked the AI to write a script to actually check. It wrote the script. The script checked out every generation tag, counted the Go files, summed the lines. The real number was 320,000. The AI had counted git churn — insertions, deletions, rewrites of the same files across 46 experiments — instead of unique lines produced. It was not lying. It was confidently wrong. And it took a human with domain experience to notice.
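The difference between churn and lines that actually exist is easy to show in code. What follows is not the script described above, only a minimal sketch of the two measurements. It assumes churn arrives as text in the tab-separated format that `git log --numstat` prints, and it counts lines for a single checkout rather than looping over every generation tag. Both function names are hypothetical.

```python
import os


def churn_from_numstat(numstat_text: str) -> int:
    """Sum insertions and deletions from `git log --numstat` style text.

    Each data line is "<added><TAB><deleted><TAB><path>". Binary files show
    "-" for both counts and are skipped, as are commit headers and blanks.
    """
    total = 0
    for line in numstat_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue
        added, deleted, _path = parts
        if added.isdecimal() and deleted.isdecimal():
            total += int(added) + int(deleted)
    return total


def lines_on_disk(root: str, suffix: str = ".go") -> int:
    """Count lines in files under root whose names end with suffix."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="replace") as fh:
                    total += sum(1 for _ in fh)
    return total


# One file written once, then rewritten twice (made-up numbers).
sample = "120\t0\tcat.go\n80\t95\tcat.go\n110\t70\tcat.go\n"
print(churn_from_numstat(sample))  # 475
```

In the made-up sample, the file ends up 145 lines long (120 + 80 - 95 + 110 - 70), while the churn total is 475. Rewriting the same files across many experiments inflates the first number without adding to the second, which is the kind of gap described above.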

This is worth sitting with. The same tool that generated 320,000 lines of working code could not accurately count them. It wrote the verification script that proved itself wrong — but only because an experienced engineer knew to ask. Anyone planning to consolidate engineering headcount around AI-augmented output should consider what else the tool is confidently wrong about — and who will catch it when the engineer who used to catch it has been laid off.

That correction also taught me something about myself. 320,000 lines of production code in six weeks, part-time, on a $250 subscription — and my first reaction was disappointment. Not pride. Disappointment. My expectations had recalibrated so completely that generating more code in six weeks than most ten-person teams produce in a year [13] felt like underperformance.

That recalibration is the capability gap in miniature. The numbers are large enough to be meaningless in conversation. I have tried sharing them. The response is usually silence, or a polite change of subject. I do not blame anyone for that. The gap between “maybe AI works, maybe it doesn’t” and “here are the receipts” is not a gap you close in conversation. It is a gap in lived experience.

The value capture problem is visible from both sides. Not every engineer can do this — it requires someone who knows how to architect systems that agents can build, who can write specifications precise enough to generate from, and who has the domain experience to catch the tool when it is wrong. But the engineers who develop that combination produce volume that used to require a team. The salary structure at a large company is designed for one person producing one person’s output. The compensation model does not have a box for this — and neither does the risk model.

What the Spreadsheet Teaches

This is not the first time a productivity tool created a value capture problem.

When VisiCalc shipped in 1979, it cut financial calculations that had taken days down to minutes. The United States had 339,000 accountants and accounting clerks in 1980. By 2022, there were 1.4 million accountants and auditors [6]. The profession did not shrink. It expanded, because accounting became cheaper and people demanded more of it.

But look at who captured the value. The 400,000 bookkeeping and accounting clerk jobs — the ones doing the arithmetic VisiCalc automated — disappeared. The 600,000 new jobs were for analysts, auditors, and advisors doing higher-value work. The profession grew. The clerks lost. The firms captured the efficiency gains.

The pattern is consistent. Productivity tools expand what is possible. The people doing the newly possible work benefit. The people whose existing work was automated do not. And the organizations that deploy the tools capture the surplus.

The AI version of this pattern is playing out now, but with a twist. Some organizations are deploying the tools and capturing the productivity gains directly. Others are not — but their engineers are learning the tools on their own time, and the productivity gains show up anyway in the form of better designs, faster iteration, and sharper architectural decisions. In both cases, the firms capture the surplus. The engineer’s compensation has not changed.

The Productivity Paradox

The research on AI-augmented productivity is contradictory in a way worth examining.

GitHub’s controlled study found developers completed tasks 55.8% faster with Copilot [7]. Brynjolfsson, Li, and Raymond studied 5,179 customer service agents and found a 14% average productivity increase — but a 34% increase for novice workers and minimal gains for experienced ones [8]. Noy and Zhang found that ChatGPT compressed the productivity distribution: lower-ability workers improved the most, and inequality between workers decreased [9].

Then METR ran a randomized controlled trial with experienced open-source developers — people maintaining repos with 22,000+ stars and over a million lines of code. Those developers took 19% longer with AI tools. They believed AI had sped them up by 20% — a 39-percentage-point perception gap [10].

The pattern across these studies is consistent. AI compresses the skill distribution. It lifts the floor. It does not raise the ceiling. Novices improve. Experts do not — or get slower.

Except that the METR study measured experienced developers using AI the way most people use AI — as an assistant within existing workflows. It did not measure what happens when an experienced developer redesigns the workflow around the tool — when the developer builds orchestration pipelines, manages context budgets, designs verification layers, and treats the AI as infrastructure rather than a chat partner.

That is a different capability. And the data on it does not exist yet, because the population that has developed it is small. The question of whether that population stays small — or whether every developer develops this capability within a year — is the commoditization question.

Will This Capability Hold Its Value?

Two forces pull in opposite directions.

The first force is that the tools get better. Each model release is more capable than the last. The gap between a developer with six months of orchestration experience and one with eighteen months narrows. If the model or coding agent handles more of the planning, the decomposition, the context management — then the orchestration skill depreciates. The capability commoditizes. Everyone can do it.

The second force is that the ceiling rises. As the tools improve, what is possible expands. The engineer who already knows how to architect systems that agents can build — who understands verification pipelines, layered construction, context budgets — can do more with better tools, not the same thing more easily. Better models do not eliminate the need for someone who knows what to build and in what order. They amplify that person’s output.

Which force wins depends on what the capability actually is.

If the capability is “knowing how to prompt Claude” — that is commoditized tomorrow. Every model update makes prompting easier. The skill depreciates with each release.

If the capability is “knowing how to architect systems that agents can build reliably” — that is systems thinking applied to a new medium. Systems thinking has never been commoditized. The medium changes. The need for someone who can decompose a problem, define the construction order, and verify the output does not.

I do not know which force wins. But I know which bet I would make. The engineers who invest now in learning how to architect for AI-assisted generation — not just how to prompt, but how to specify, decompose, verify, and iterate — are building a skill that compounds. Every improvement in the model makes them more productive, not less relevant. The engineers who wait for the tools to get easy enough that everyone can do it are betting that the ceiling stays where it is.

The Compensation Problem

Here is a pattern. An engineer develops AI-augmented capability on their own time, with their own tools. The employer benefits from the thinking that capability produces. The compensation does not change. The employment risk does not change. The company may fail or succeed. Either way, the engineer has a salary — until they don’t.

From the employer’s side, the pattern looks different. AI captured roughly half of all global venture capital in 2025 — $202 billion, up 75% from the prior year [11]. In Y Combinator’s most recent batch, 88% of startups were AI-native, and 30% of founders came from large technology companies [12]. The engineers with this capability are in demand. The question is not whether they will be recruited. The question is what the current employer offers that makes staying rational.

Paying 1x salary for 10x output sounds like a bargain until the person producing that output recognizes the arithmetic. You cannot pay someone the same rate when their tools have changed the unit economics of their labor by an order of magnitude — not because it is unfair, but because the market will not let you. Someone else will pay more.

The structural problem is that neither side has a model for this. The employer does not know how to evaluate output it cannot review, retain talent it cannot replace, or absorb risk it has not measured. But the engineer who has developed this capability is not powerless. They have leverage — real, demonstrable, quantifiable leverage. The receipts exist. The git history exists. The gap between what they produce and what the compensation model assumes is the negotiating position.

The engineers who recognize this early — who build the capability, document the output, and understand their own value before the market catches up — will be in a position to name their terms. The ones who wait for their employer to notice will be waiting a long time.

The world where one person produced one person’s output is over. The question is whether you are building the capability that gives you a seat at the table when the new model arrives, or hoping someone else figures it out for you.

REFERENCES

[1] Cisco (2024). 2024 Data Privacy Benchmark Study. Cisco Systems.

[2] The Decoder (2024). “15% of Companies Ban Code AI, But 99% of Developers Use It Anyway.”

[3] Jensen, M.C. and Meckling, W.H. (1976). “Theory of the Firm: Managerial Behavior, Agency Costs and Ownership Structure.” Journal of Financial Economics, 3(4), 305-360.

[4] Jarrahi, M.H. and Ritala, P. (2025). “Rethinking AI Agents: A Principal-Agent Perspective.” California Management Review.

[5] Fortune (2023). “Samsung, Apple, and Wall Street Banks Have Banned ChatGPT.” Fortune Media.

[6] Harford, T. (2024). “What the Birth of the Spreadsheet Teaches Us About Generative AI.” Financial Times / timharford.com.

[7] Peng, S. et al. (2023). “The Impact of AI on Developer Productivity: Evidence from GitHub Copilot.” arXiv:2302.06590.

[8] Brynjolfsson, E., Li, D., and Raymond, L. (2023). “Generative AI at Work.” NBER Working Paper 31161. Quarterly Journal of Economics, 2025, 140(2), 889-942.

[9] Noy, S. and Zhang, W. (2023). “Experimental Evidence on the Productivity Effects of Generative Artificial Intelligence.” Science, 381(6654).

[10] METR (2025). “Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity.” metr.org.

[11] Crunchbase (2025). “AI Captured Nearly 50% of All Global VC Funding in 2025.” Crunchbase News.

[12] Extruct AI (2025). “Analysis of Y Combinator S25 Batch Composition.” extruct.ai/research.

[13] McConnell, S. (2004). Code Complete, 2nd edition. Microsoft Press. COCOMO data shows 1,000-20,000 LOC per staff-year for 100K+ LOC projects; a 10-person team averages roughly 26,000 LOC/year.

Petar Djukic
