Nine in Ten Executives Report No AI Impact. Their Forecasts Disagree.

A working paper landed at NBER this month with data you can actually work with. Yotzov, Barrero, Bloom, and coauthors (w34836) surveyed roughly 6,000 executives across the United States, the United Kingdom, Germany, and Australia. Sixty-nine percent of firms use AI in some form. Executives personally spend about an hour and a half a week on it. Nine in ten report no realized impact over the past three years. In the same questionnaire, the same executives forecast plus 1.4 percent productivity and minus 0.7 percent employment over the next three years.
Baslandze, Edwards, and Graham, in a companion NBER paper (w34984) published the same week, put a name on the pattern. They call it a new productivity paradox. Perceived gains exceed measured gains, and the gap shows up inside the same firms, often inside the same people's heads. The finance version of the question writes itself. How do you govern a capex program where the trailing three years show nothing and the forward three years are priced as a foregone conclusion?
CFOs who have run a decade of technology programs already know this shape. The vendor demo shows a 40 percent productivity lift. The pilot lands somewhere between zero and measurable. The board budget keeps the 40 percent number as the reference point anyway, because nobody wants to reprice the project. Sunk-cost behavior runs through technology investments the same way it runs through M&A. The difference with AI is that the forecast-to-reality spread is wider and the underlying category is still new enough that almost nobody has a reliable benchmark.
A few things push executive forecasts above realized results. One is exposure asymmetry. Executives see the vendor's best demos and the consulting deck's best case studies. Line users see the jagged frontier. Another is attribution slippage. When AI ships alongside a workflow redesign, the productivity gain gets credited to AI even when the real work was the process rebuild. A third is the optimism layer that sits over any forward-looking survey. Nobody fills in a forecast of minus one percent productivity on an investor-facing document. The same executives who saw no realized impact still reached for a plus 1.4.
The spread is pricing information. Size the capex program to realized gains closer to the field studies than to the lab demos. The Karger and Tetlock forecasting tournament (NBER w35046), also published this month, lands on a median forecast of 2.5 percent US GDP growth through 2030, with a rapid-AI scenario of 4 percent. The IMF April scenario note stays in the same band. Acemoglu's 0.7 percent TFP floor sits underneath. Those are the numbers your capex program should stress-test against, not the vendor slide.
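The stress test itself is ordinary capex math. A minimal sketch, assuming an illustrative program cost, revenue base, horizon, and discount rate (none of which come from the papers; only the scenario percentages do):

```python
# Stress-test an AI capex program against published scenario bands
# rather than the vendor's lift number. Capex, revenue base, horizon,
# and discount rate below are illustrative assumptions.

def npv(cash_flows, rate=0.08):
    """Present value of a list of annual cash flows."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows, start=1))

def program_npv(capex, annual_base, uplift, years=3, rate=0.08):
    """NPV of a program whose annual value scales with the uplift scenario."""
    flows = [annual_base * uplift] * years
    return npv(flows, rate) - capex

scenarios = {
    "tournament median (2.5%)": 0.025,
    "rapid-AI (4%)": 0.04,
    "Acemoglu TFP floor (0.7%)": 0.007,
    "vendor slide (40% lift)": 0.40,
}

capex = 5_000_000            # illustrative program cost
revenue_base = 100_000_000   # illustrative annual base the uplift applies to

for name, uplift in scenarios.items():
    print(f"{name}: NPV = {program_npv(capex, revenue_base, uplift):,.0f}")
```

The point of running all four rows side by side is that the vendor row stops being the reference point and becomes one scenario among several, with the published bands underneath it.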
Three checks flag early whether your AI program is running on realized value or on executive forecast. First, ask whether any line manager in your organization can point to a workflow where measured throughput increased. If the answer is a vendor slide rather than a named process with a before-and-after cycle time, the program is running on optimism. Second, reconcile executive time spent on AI with the value attributed to it. An hour and a half a week is a measurable input. If the attributed value is twenty times that on a run-rate basis, you're funding a narrative. Third, pull the capex justification memo and circle every productivity number that came from a controlled experiment. Discount each one by the field-to-lab ratio, which sits somewhere near a tenth.
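The arithmetic behind the second and third checks fits in a few lines. A back-of-envelope sketch, where the executive count, loaded hourly cost, working weeks, and attributed value are all invented inputs for illustration, not figures from the surveys:

```python
# Back-of-envelope math for checks 2 and 3. Every input below is an
# assumption chosen for illustration; only the ~one-tenth field-to-lab
# ratio and the 1.5 hours/week figure come from the article.

FIELD_TO_LAB_RATIO = 0.10  # field results land near a tenth of lab results

def discounted_lift(lab_lift, ratio=FIELD_TO_LAB_RATIO):
    """Check 3: haircut any controlled-experiment productivity number."""
    return lab_lift * ratio

def attribution_multiple(attributed_annual_value, execs,
                         hours_per_week=1.5, loaded_hourly_cost=500,
                         weeks=48):
    """Check 2: attributed run-rate value per dollar of executive time
    actually spent. A multiple far above ~20x suggests narrative funding."""
    time_input = execs * hours_per_week * weeks * loaded_hourly_cost
    return attributed_annual_value / time_input

# A vendor's 40 percent lift, haircut to field conditions:
print(round(discounted_lift(0.40), 2))   # 4 percent, not 40

# 50 executives at 1.5 hrs/week, $20M of attributed run-rate value:
print(round(attribution_multiple(20_000_000, execs=50), 1))
```

The loaded-cost and headcount numbers will differ by firm; the check is the ratio, not the specific dollar figure.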
MIT Project NANDA's number from last year cuts the same way. Ninety-five percent of enterprise generative AI pilots produced no measurable business impact. The 5 percent that worked had redesigned the workflow around the tool rather than bolting the tool onto the old workflow. The executives in the Yotzov-Barrero-Bloom survey who do report realized impact are almost certainly concentrated in that 5 percent. Finance teams that want to join them have to treat AI investment as an organizational program, not a software purchase.
The gap between forecast and reality is the market telling you the technology is real and the timing is uncertain. Both of those statements are useful. Treat expected gains as a scenario. Size the budget to what has already been realized. Run both sides through the same finance discipline you would apply to any other capex program. The paradox resolves as workflows redesign around the tool. Until that redesign happens at scale, a CFO's job is to keep the forecast and the realization close enough that the board can read both without a translation layer.