The Week Finance Got Its Agent Stack

For two years I’ve been saying the same thing about AI agents: the technology is coming, but it’s not here yet. Too many vendors were slapping the word “agent” on what amounted to a chatbot with a calendar invite. I had a simple test: does the thing run a conditional loop? Can it take a task, evaluate the output, decide what to do next, and keep working until the job is done? Most so-called agents couldn’t pass that bar.
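To make the conditional-loop test concrete, here is a minimal Python sketch of what I mean by it. Everything in it is hypothetical and for illustration only (the `run_agent_loop` name, the `act` and `evaluate` callbacks, the `max_steps` budget); no vendor's product is implemented this way as far as I know.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class LoopResult:
    done: bool                      # did the evaluator accept an output?
    steps: int                      # how many iterations were used
    history: List[Any] = field(default_factory=list)


def run_agent_loop(
    task: Any,
    act: Callable[[Any], Any],
    evaluate: Callable[[Any], bool],
    max_steps: int = 5,
) -> LoopResult:
    """Take a task, act, evaluate the output, and keep going until done.

    `act` and `evaluate` are hypothetical stand-ins for a model call and
    a check on its output; `max_steps` is an assumed safety budget.
    """
    history: List[Any] = []
    state = task
    for step in range(1, max_steps + 1):
        output = act(state)          # do one unit of work
        history.append(output)
        if evaluate(output):         # is the job done?
            return LoopResult(True, step, history)
        state = output               # otherwise feed the result back in
    return LoopResult(False, max_steps, history)


# Toy usage: "work" is incrementing a counter until it reaches 3.
result = run_agent_loop(task=0, act=lambda n: n + 1, evaluate=lambda n: n >= 3)
print(result.done, result.steps)  # True 3
```

A chatbot with a calendar invite has no equivalent of the `evaluate` branch: it produces one output and stops, whether or not the job is finished.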

I’m covering them now. The agents are here. The models got meaningfully better at reasoning: Claude Opus 4.5 and 4.6, GPT-5.2 and 5.3, and Gemini 3 all ship with chain-of-thought reasoning built in. Previous “agents” were auto-complete running in a loop. These actually evaluate context, weigh rules, and decide what to do next. Inference costs have dropped roughly 280x for equivalent model performance since late 2022. Context windows have expanded to 1M tokens, enough to hold an entire chart of accounts and a month of transactions in working memory.

Goldman Sachs partnered with Anthropic to build custom agents across trade accounting, compliance checks, transaction reconciliation, and client vetting. Goldman’s CIO called Claude’s reasoning “surprisingly capable” at rules-based accounting tasks beyond coding. When a bank that size starts replacing vendor contracts with in-house agents, the rest of the industry notices.

Ramp’s Accounting Agent auto-codes every expense across GL account, department, class, location, and custom fields, reviewing 100% of spend against policy. It learns from corrections, so the model gets sharper with each close cycle. Teams report a 3x faster monthly close. For context, only 2% of finance teams currently use AI as their primary method for coding and posting transactions.

Oracle NetSuite shipped Autonomous Close: exception management, close tracking, flux analysis, and AI-powered bank matching. The system even identifies missing transactions and dynamically accrues them as pro forma entries. HPE’s “Alfred,” built with Deloitte, is scaling from pilot to production across credit, collections, AP/AR, and forecasting.

The architecture pattern across all four is roughly the same: an ingestion layer pulls data from ERPs and bank feeds, a reasoning layer applies rules and learned patterns, an action layer codes transactions and flags exceptions, and an escalation layer determines what gets auto-synced versus what needs human sign-off. 54% of CFOs say agent integration is their top digital transformation priority for 2026. But 86% of those who’ve deployed agents encountered hallucinated data, and only 21% have mature agent governance.
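The four layers described above can be sketched in a few dozen lines of Python. This is a toy under stated assumptions, not any vendor's implementation: the `Transaction` fields, the `VENDOR_RULES` table, the 0.9 confidence threshold, and the 10,000 amount limit are all invented for illustration, and the "reasoning" here is a plain lookup standing in for a model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Transaction:
    txn_id: str
    vendor: str
    amount: float
    gl_account: Optional[str] = None
    confidence: float = 0.0


# Hypothetical rules and thresholds, invented for this sketch.
VENDOR_RULES = {"AWS": "6100-Cloud Hosting", "Delta": "6400-Travel"}
AUTO_SYNC_THRESHOLD = 0.9
REVIEW_AMOUNT_LIMIT = 10_000.0


def ingest(raw_rows: List[dict]) -> List[Transaction]:
    """Ingestion layer: normalize rows from an ERP export or bank feed."""
    return [Transaction(r["id"], r["vendor"], float(r["amount"])) for r in raw_rows]


def reason(txn: Transaction) -> Tuple[Optional[str], float]:
    """Reasoning layer: a rule lookup standing in for a model call."""
    account = VENDOR_RULES.get(txn.vendor)
    return (account, 0.95) if account else (None, 0.0)


def act(txn: Transaction) -> Transaction:
    """Action layer: code the transaction with the proposed GL account."""
    txn.gl_account, txn.confidence = reason(txn)
    return txn


def escalate(txn: Transaction) -> str:
    """Escalation layer: auto-sync only confident, low-value, coded items."""
    if txn.gl_account is None or txn.confidence < AUTO_SYNC_THRESHOLD:
        return "human_review"
    if txn.amount > REVIEW_AMOUNT_LIMIT:
        return "human_review"
    return "auto_sync"


def run_pipeline(raw_rows: List[dict]) -> Dict[str, str]:
    return {t.txn_id: escalate(act(t)) for t in ingest(raw_rows)}


rows = [
    {"id": "t1", "vendor": "AWS", "amount": "1200.00"},
    {"id": "t2", "vendor": "Delta", "amount": "25000.00"},
    {"id": "t3", "vendor": "Unknown LLC", "amount": "50.00"},
]
decisions = run_pipeline(rows)
print(decisions)
# {'t1': 'auto_sync', 't2': 'human_review', 't3': 'human_review'}
```

The escalation function is where the governance question lives: the thresholds it checks are exactly the kind of thing an AI governance policy would need to specify and an auditor would want to see.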

Three takeaways: the buy/build/DIY spectrum just got real, agent governance is the new SOX readiness (start with an AI governance policy), and if you’re choosing a pilot, start with the close. The Crawl-Walk-Run framework maps the sequence. Every vendor this week targeted the same workflow cluster: transaction coding, reconciliation, close management. That’s where agents deliver the fastest ROI and where the control framework is easiest to define.