Article

Why Coding Agents Became the First Real Breakout Success of AI

It is not far off to say that coding agents have been the most visible and widely adopted AI product to date.

Agent offerings like Claude Code, Codex and GitHub Copilot are already used across many enterprises.

On the other hand, we do not see the same level of success in other white collar domains or in day to day life. We talk about agents doing everything, but very little work has actually been handed off in areas like finance, legal or healthcare.

Even within IT, functions like project management, design and architecture are using LLMs, but have not fully offloaded work in the way coding has.

So what made coding such a good problem for LLMs to solve?


Verifiability 🔗

Real world problems are messy. Problem statements are unclear and constantly changing.

  • A marketing strategy that worked last year may not work today
  • Laws change over time
  • Copy written yesterday may not perform today

There are no clear objective answers

LLMs struggle in environments like this.

Coding is different.

  • Languages and frameworks are well defined
  • Code compiles or it does not
  • Tests pass or fail
  • Programs run or crash

Outputs are objective and predictable.

This allows LLMs to generate and evaluate their own output.

  • Generate code
  • Run it
  • Observe failure
  • Fix it

Fast feedback loop 🔗

Many domains have slow feedback cycles.

  • Marketing takes weeks
  • Strategy takes months

In coding, feedback is immediate.

  • Errors show up instantly
  • Tests fail quickly
  • Output is visible right away

This gives the model constant signals to adjust.

  • feedback is fast
  • feedback is frequent
  • feedback is actionable

Few domains have this property.


Systems, not models 🔗

Coding agents are not just LLMs with API access.

They are systems built around the model.

  • LLM reasoning
  • tool usage
  • execution environments
  • test runners
  • diff based editing
  • retry loops

Most simple agent setups miss this.

The model alone is not reliable enough. The system around it makes it work.

In many cases, the LLM is the least reliable part of the stack.


Natural modularity of the domain 🔗

LLMs perform better on smaller, well defined tasks.

Software is naturally structured this way.

  • functions
  • classes
  • files
  • services

Agents can operate locally.

  • modify one function
  • update one file
  • fix one test

They do not need full understanding of the entire system to be useful.


High tolerance for imperfection 🔗

In many domains, mistakes are costly.

  • legal errors have consequences
  • financial mistakes are expensive
  • medical mistakes are dangerous

Coding is more forgiving.

  • code fails fast
  • errors are visible
  • tests catch issues
  • changes can be reviewed

Imperfect output is still useful.

  • partially correct code can be fixed
  • progress can be incremental

The system allows iteration instead of requiring perfection.


Built on existing infrastructure 🔗

Coding agents benefit from existing tooling.

  • version control
  • CI/CD pipelines
  • automated tests
  • linters

These systems already provide validation and structure.

Agents can plug into them directly.

Other domains do not have this level of built in feedback and automation.


Text as the interface 🔗

LLMs work in text.

Code fits naturally into that.

  • structured
  • predictable
  • machine readable

There is no translation gap.

  • model writes code
  • system executes code

This direct mapping is rare in other domains.


Supporting factors 🔗

A few things helped accelerate adoption.

  • developers are early adopters and comfortable with new tools
  • cost was often subsidised or bundled

These helped adoption, but they are not the main reason for success.


Closing 🔗

Most other domains lack these properties.

  • outputs are verifiable
  • feedback loops are fast
  • tooling already exists
  • tasks are modular
  • imperfection is acceptable

That is why coding saw real adoption first.

If we want to understand where agents will work next, the question is simple:

Which other domains look like this?