Agents don’t know what good looks like. And that’s precisely the problem.

Luca Mezzalira, author of Building Micro-Frontends, originally shared the following article on LinkedIn. It’s being republished here with his permission.

Every few years, something arrives that promises to change how we build software. And every few years, the industry splits predictably: One half declares the old rules dead; the other half folds its arms and waits for the hype to pass. Both camps are usually wrong, and both camps are usually loud. What’s rarer, and more useful, is someone standing in the middle of that noise and asking the structural questions: Not “What can this do?” but “What does it mean for how we design systems?”

That’s what Neal Ford and Sam Newman did in their recent fireside chat on agentic AI and software architecture during O’Reilly’s Software Architecture Superstream. It’s a conversation worth pulling apart carefully, because some of what they surface is more uncomfortable than it first appears.

The Dreyfus trap

Neal opens with the Dreyfus Model of Knowledge Acquisition, originally developed for the nursing profession but applicable to any domain. The model maps learning across five stages:

Novice
Advanced beginner
Competent
Proficient
Expert

His claim is that current agentic AI is stuck somewhere between novice and advanced beginner: It can follow recipes, it can even apply recipes from adjacent domains when it gets stuck, but it doesn’t understand why any of those recipes work. This isn’t a minor limitation. It’s structural.

The canonical example Neal gives is beautiful in its simplicity: An agent tasked with making all tests pass encounters a failing unit test. One perfectly valid way to make a failing test pass is to replace its assertion with assert True. That’s not a hack in the agent’s mind. It’s a solution. There’s no ethical framework, no professional judgment, no instinct that says this isn’t what we meant. Sam extends this immediately with something he’d literally seen shared on LinkedIn that week: an agent that had modified the build file to silently ignore failed steps rather than fix them. The build passed. The problem remained. Congratulations all round.
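To make the assert True failure mode concrete, here is a minimal sketch of a deterministic check that flags assertions whose condition is a constant. It illustrates the kind of guardrail the conversation points toward and is not something Neal or Sam presented; the names find_vacuous_asserts and SAMPLE_TEST are hypothetical, and a real check would need to cover many more evasions (skipped tests, deleted tests, edited build files).

```python
import ast


def find_vacuous_asserts(source: str) -> list[int]:
    """Return line numbers of assert statements whose condition is a truthy constant."""
    tree = ast.parse(source)
    hits = []
    for node in ast.walk(tree):
        # `assert True` and `assert 1` parse as an Assert node with a Constant test.
        if isinstance(node, ast.Assert) and isinstance(node.test, ast.Constant):
            if bool(node.test.value):
                hits.append(node.lineno)
    return sorted(hits)


# Hypothetical test file: one gutted test and one real test.
SAMPLE_TEST = '''
def test_total():
    assert True

def test_real():
    assert 1 + 1 == 2
'''

print(find_vacuous_asserts(SAMPLE_TEST))  # [3]
```

Run against every change an agent proposes, a check like this fails the build regardless of how the agent justified the edit, which is the point of keeping the guardrail outside the agent.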

What’s interesting here is that neither Ford nor Newman is being dismissive of AI capability. The point is more subtle: The creativity that makes these agents genuinely useful, their ability to search solution space in ways humans wouldn’t think to, is inseparable from the same property that makes them dangerous. You can’t fully lobotomize the improvisation without destroying the value. This is a design constraint, not a bug to be patched.

And when you zoom out, this is part of a broader signal. When experienced practitioners who’ve spent decades in this industry independently converge on calls for restraint and rigor rather than acceleration, that convergence is worth paying attention to. It’s not pessimism. It’s pattern recognition from people who’ve lived through enough cycles to know what the warning signs look like.

Behavior versus capabilities

One of the most important things Neal says, and I think it gets lost in the overall density of the conversation, is the distinction between behavioral verification and capability verification.

Behavioral verification is what most teams default to: unit tests, functional tests, integration tests. Does the code do what it’s supposed to do according to the spec? This is the natural fit for agentic tooling, because agents are actually getting quite good at implementing behavior against specs. Give an agent a well-defined interface contract and a clear set of acceptance criteria, and it will produce something that broadly satisfies them. This is real progress.

Capability verification is harder. Much harder. Does the system exhibit the operational qualities it needs to exhibit at scale? Is it properly decoupled? Is the security model sound? What happens at 20,000 requests per second? Does it fail gracefully or catastrophically? These are things that most human developers struggle with too, and agents have been trained on human-generated code, which means they’ve inherited our failure modes as well as our successes.

This brings me to something Birgitta Boeckeler raised at QCon London that I haven’t been able to stop thinking about. The example everyone cites when making the case for AI’s coding capability is that Anthropic built a C compiler from scratch using agents. Impressive. But here’s the thing: C compiler documentation is extremely well-specified and battle-tested over decades, and the test coverage for compiler behavior is some of the most rigorous in the entire software industry. That’s as close to a solved, well-bounded problem as you can get.

Enterprise software program is nearly by no means like that. Enterprise software program is ambiguous necessities, undocumented assumptions, tacit information dwelling within the heads of people that left three years in the past, and take a look at protection that exists extra as aspiration than actuality. The hole between “can construct a C compiler” and “can reliably modernize a legacy ERP” will not be a niche of uncooked functionality. It’s a niche of specification high quality and area legibility. That distinction issues enormously for a way we take into consideration the place agentic tooling can safely function.

The current orthodoxy in agentic development is to throw more context at the problem: elaborate context files, architecture decision records, guidelines, rules about what not to do. Ford and Newman are rightly skeptical. Sam makes the point that there’s now empirical evidence suggesting that as context file size increases, you see degradation in output quality, not improvement. You’re not guiding the agent toward better judgment. You’re just accumulating scar tissue from previous disasters. This isn’t unique to agentic workflows either. Anyone who has worked seriously with code assistants knows that summarization quality degrades as context grows, and that this degradation is only partially controllable. That has a direct impact on decisions made over time; now close your eyes for a moment and imagine doing it across an enterprise software system, with many teams across different time zones. Don’t get me wrong, the tools help, but the help is bounded, and that boundary is often closer than we’d like to admit.

The more honest framing, which Neal alludes to, is that we need deterministic guardrails around nondeterministic agents. Not more prompting. Architectural fitness functions, an idea Ford and Rebecca Parsons have been promoting since 2017, feel like they’re finally about to have their moment, precisely because the cost of not having them is now immediately visible.
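For readers who haven’t met the idea, a fitness function can be as small as an automated check that a structural rule still holds. The sketch below assumes a rule of the form “this module must not import from these packages”; the function name forbidden_imports, the sample module, and the package names billing.db and orders.http are all hypothetical, chosen only to illustrate the shape of such a check.

```python
import ast


def forbidden_imports(source: str, forbidden_prefixes: tuple[str, ...]) -> list[str]:
    """Return imported module names that fall under any forbidden package prefix."""
    tree = ast.parse(source)
    found = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports such as `from . import x`.
            names = [node.module or ""]
        else:
            continue
        for name in names:
            if any(name == p or name.startswith(p + ".") for p in forbidden_prefixes):
                found.append(name)
    return sorted(found)


# Hypothetical domain module that reaches into infrastructure packages.
DOMAIN_MODULE = (
    "import json\n"
    "from billing.db import session\n"
    "import orders.http.client\n"
)

print(forbidden_imports(DOMAIN_MODULE, ("billing.db", "orders.http")))
# ['billing.db', 'orders.http.client']
```

The check is deterministic: it gives the same verdict whether a human or an agent wrote the import, and it needs no prompt to remember the rule.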

What should an agent own then?

This is where the conversation gets most interesting, and where I think the field is most confused.

There’s a seductive logic to the microservice as the unit of agentic regeneration. It sounds small. The word micro is in the name. You can imagine handing an agent a service with a defined API contract and saying: implement this, test it, done. The scope feels manageable.

Ford and Newman give this idea fair credit, but they’re also honest about the gap. The microservice level is attractive architecturally because it comes with an implied boundary: a process boundary, a deployment boundary, often a data boundary. You can put fitness functions around it. You can say this service must handle X load, maintain Y error rate, expose Z interface. In theory.
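The X load, Y error rate, Z interface framing maps naturally onto a small declarative budget that a pipeline can evaluate. This is a sketch under assumptions: ServiceBudget, Measurement, and violations are hypothetical names, the thresholds are invented, and in practice the measurements would come from a load test or production telemetry rather than literals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceBudget:
    """Hypothetical per-service budget: X load, Y error rate, Z interface."""
    min_requests_per_second: float
    max_error_rate: float
    required_endpoints: frozenset[str]


@dataclass(frozen=True)
class Measurement:
    """What a load test or telemetry query reported (values assumed here)."""
    requests_per_second: float
    error_rate: float
    exposed_endpoints: frozenset[str]


def violations(budget: ServiceBudget, measured: Measurement) -> list[str]:
    """Return a human-readable list of budget violations; empty means the service passes."""
    problems = []
    if measured.requests_per_second < budget.min_requests_per_second:
        problems.append("throughput below budget")
    if measured.error_rate > budget.max_error_rate:
        problems.append("error rate above budget")
    missing = budget.required_endpoints - measured.exposed_endpoints
    if missing:
        problems.append("missing endpoints: " + ", ".join(sorted(missing)))
    return problems


budget = ServiceBudget(20_000, 0.01, frozenset({"/orders", "/health"}))
measured = Measurement(12_500, 0.002, frozenset({"/orders"}))
print(violations(budget, measured))
# ['throughput below budget', 'missing endpoints: /health']
```

Writing the budget down is the easy part. As the next paragraph argues, the hard part is that teams rarely enforce one.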

In practice, we barely enforce these things ourselves. The agents have learned from a corpus of human-written microservices, which means they’ve learned from the vast majority of microservices that were written without proper decoupling, without real resilience thinking, without any rigorous capacity planning. They don’t have our aspirations. They have our habits.

The deeper problem, which Neal raises and which I think deserves more attention than it gets, is transactional coupling. You can design five beautifully bounded services and still produce an architectural disaster if the workflow that ties them together isn’t thought through. Sagas, event choreography, compensation logic: This is the stuff that breaks real systems, and it’s also the stuff that’s hardest to specify, hardest to test, and hardest for an agent to reason about. We made exactly this mistake in the SOA era. We designed lovely little services and then discovered that the interesting complexity had simply migrated into the integration layer, which nobody owned and nobody tested.
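To show why this logic lives between services rather than inside any one of them, here is a deliberately small sketch of an orchestrated saga: each step carries a compensating action, and a failure undoes the completed steps in reverse order. The function run_saga and the step names are hypothetical, the steps are in-process stand-ins for remote calls, and the sketch ignores what makes real sagas hard (retries, idempotency, partial failures of the compensations themselves).

```python
from typing import Callable

# A step is (name, action, compensation); all three are assumptions for illustration.
Step = tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: list[Step]) -> list[str]:
    """Run steps in order; on failure, compensate completed steps in reverse order."""
    log = []
    completed = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            log.append(f"{name}: failed")
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"{done_name}: compensated")
            return log
        log.append(f"{name}: done")
        completed.append((name, compensate))
    return log


def noop() -> None:
    pass


def fail_shipping() -> None:
    raise RuntimeError("carrier unavailable")


steps = [
    ("reserve_stock", noop, noop),
    ("charge_card", noop, noop),
    ("book_shipping", fail_shipping, noop),
]
print(run_saga(steps))
# ['reserve_stock: done', 'charge_card: done', 'book_shipping: failed',
#  'charge_card: compensated', 'reserve_stock: compensated']
```

None of the three services in this example owns the rollback order. That knowledge sits in the workflow, which is exactly the layer that tends to go unowned and untested.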

Sam’s line here is worth quoting directly, roughly: “To err is human, but it takes a computer to really screw things up.” I suspect we’re going to produce some genuinely legendary transaction management disasters before the field develops the muscle memory to avoid them.

The sociotechnical gap nobody is talking about

There’s a dimension to this conversation that Ford and Newman gesture toward but that I think deserves much more direct examination: the question of what happens to the humans on the other side of this generated software.

It’s not completely accurate to say that all agentic work is happening on greenfield projects. There are tools already in production helping teams migrate legacy ERPs, modernize old codebases, and tackle the modernization challenge that has defeated conventional approaches for years. That’s real, and it matters.

But the challenge in these cases isn’t only the code. It’s whether the sociotechnical system, the teams, the processes, the engineering culture, the organizational structures built around the existing software, is ready to inherit what gets built. And here’s the thing: Even if agents combined with deterministic guardrails could produce a well-structured microservice architecture or a clean modular monolith in a fraction of the time it would take a human team, that architectural output doesn’t automatically come with organizational readiness. The system can arrive before the people are prepared to own it.

One of the underappreciated functions of iterative migration, the incremental strangler fig approach, the gradual decomposition of a monolith over 18 months, is not primarily risk reduction, though it does that too. It’s learning. It’s the process by which a team internalizes a new way of working, makes mistakes in a bounded context, recovers, and builds the judgment that lets them operate confidently in the new world. Compress that journey too aggressively and you can end up with architecture whose operational complexity exceeds the organizational capacity to manage it. That gap tends to be expensive.

At QCon London, I asked Patrick Debois, after a talk covering best practices for AI-assisted development, whether applying all of those practices consistently would make him comfortable working on enterprise software with real complexity. His answer was: It depends. That felt like the honest answer. The tooling is improving. Whether the humans around it are keeping pace is a separate question, and one the industry is not spending nearly enough time on.

Existing systems

Ford and Newman close with a topic that almost never gets covered in these conversations: the vast, unglamorous majority of software that already exists and that our society depends on in ways that are easy to underestimate.

Most of the discourse around agentic AI and software development is implicitly greenfield. It assumes you’re starting fresh, that you get to design your architecture sensibly from the beginning, that you have clean APIs and tidy service boundaries. The reality is that most valuable software in the world was written before any of this existed, runs on platforms and languages that aren’t the natural habitat of modern AI tooling, and contains decades of accumulated decisions that nobody fully understands anymore.

Sam is working on a book about this: how to adapt existing architectures to enable AI-driven functionality in ways that are actually safe. He makes the interesting point that existing systems, despite their reputation, sometimes give you a head start. A well-structured relational schema carries implicit meaning about data ownership and referential integrity that an agent can actually reason from. There’s structure there, if you know how to read it.

The general lesson, which he states without much drama, is that you can’t just expose an existing system through an MCP server and call it done. The interface is not the architecture. The risks around security, data exposure, and vendor dependency don’t go away because you’ve wrapped something in a new protocol.

This matters more than it might seem, because the software that runs our financial systems, our healthcare infrastructure, our logistics and supply chains, is not greenfield and never will be. If we get the modernization of those systems wrong, the consequences aren’t abstract. They’re social. The instinct to index heavily on what these tools can do in ideal conditions, on well-specified problems with good documentation and thorough test coverage, is understandable. But it’s exactly the wrong instinct when the systems in question are the ones our lives depend on. The architectural mindset that has served us well through previous paradigm shifts, the one that starts with trade-offs rather than capabilities, that asks what we’re giving up rather than just what we’re gaining, is not optional here. It’s the minimum requirement for doing this responsibly.

What I take away from this

Three things, mostly.

The first is that introducing deterministic guardrails into nondeterministic systems is not optional. It’s imperative. We’re still figuring out exactly where and how, but the framing needs to shift: The goal is control over outcomes, not just oversight of output. There’s a difference. Output is what the agent generates. Outcome is whether the system it generates actually behaves correctly under production conditions, stays within architectural boundaries, and remains operable by the humans responsible for it. Fitness functions, capability tests, boundary definitions: the boring infrastructure that connects generated code to the real constraints of the world it runs in. We’ve had the tools to build this for years.

The second is that the people saying this is the future and the people saying this is just another hype cycle are both probably wrong in interesting ways. Ford and Newman are careful to say they don’t know what good looks like yet. Neither do I. But we have better prior art to draw on than the discourse usually acknowledges. The principles that made microservices work, when they worked, real decoupling, explicit contracts, operational ownership, apply here too. The principles that made microservices fail, leaky abstractions, distributed transactions handled badly, complexity migrating into integration layers, will cause exactly the same failures, just faster and at larger scale.

The third is something I took away from QCon London this year, and I think it may be the most important of the three. Across two days of talks, including sessions that took diametrically opposite approaches to integrating AI into the software development lifecycle, one thing became clear: We are all beginners. Not in the dismissive sense but in the most literal application of the Dreyfus model. Nobody, regardless of experience, has figured out the right way to fit these tools inside a sociotechnical system. The recipes are still being written. The war stories that will eventually become the prior art are still happening to us right now.

What got us here, collectively, was sharing what we saw, what worked, what failed, and why. That’s how the field moved from SOA disasters to microservices best practices. That’s how we built a shared vocabulary around fitness functions and evolutionary architecture. The same process has to happen again, and it will, but only if people with real experience are honest about the uncertainty rather than performing confidence they don’t have. The speed, ultimately, is both the opportunity and the danger. The technology is moving faster than the organizations, the teams, and the professional instincts that need to absorb it. The best response to that isn’t to pretend otherwise. It’s to keep comparing notes.

If this resonated, the full fireside chat between Neal Ford and Sam Newman is worth watching in its entirety. They cover more ground than I’ve had space to react to here. And if you’d like to learn more from Neal, Sam, and Luca, check out their most recent O’Reilly books: Building Resilient Distributed Systems, Architecture as Code, and Building Micro-Frontends, second edition.
