It is not a stretch to say that coding agents have been the most visible and widely adopted AI product to date.
Agent offerings like Claude Code, Codex and GitHub Copilot are already used across many enterprises.
On the other hand, we do not see the same level of success in other white-collar domains or in day-to-day life. We talk about agents doing everything, but very little work has actually been handed off in areas like finance, legal or healthcare.
Even within IT, functions like project management, design and architecture are using LLMs, but have not fully offloaded work in the way coding has.
So what made coding such a good problem for LLMs to solve?
Verifiability
Real world problems are messy. Problem statements are unclear and constantly changing.
- A marketing strategy that worked last year may not work today
- Laws change over time
- Copy written yesterday may not perform today
There are no clear, objective answers.
LLMs struggle in environments like this.
Coding is different.
- Languages and frameworks are well defined
- Code compiles or it does not
- Tests pass or fail
- Programs run or crash
Outputs are objective and predictable.
This allows LLMs to generate and evaluate their own output.
- Generate code
- Run it
- Observe failure
- Fix it
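The four steps above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not how any particular agent is implemented: `fake_model` is a hypothetical stand-in for an LLM call, `check_add` is a hypothetical test, and running code with `exec` in the same process is not a safe sandbox.

```python
from typing import Callable, Optional, Tuple


def run_candidate(source: str, test: Callable[[dict], None]) -> Optional[str]:
    """Run the candidate code and its test; return an error message, or None on success."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # "Run it" (illustration only, not a sandbox)
        test(namespace)          # objective pass/fail check
    except Exception as exc:     # "Observe failure"
        return f"{type(exc).__name__}: {exc}"
    return None


def fix_loop(
    generate: Callable[[Optional[str]], str],
    test: Callable[[dict], None],
    max_attempts: int = 3,
) -> Tuple[str, int]:
    """Generate, run, observe, and retry with the failure as feedback."""
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        source = generate(feedback)            # "Generate code" / "Fix it"
        feedback = run_candidate(source, test)
        if feedback is None:
            return source, attempt
    raise RuntimeError(f"no passing candidate after {max_attempts} attempts: {feedback}")


def fake_model(feedback: Optional[str]) -> str:
    """Hypothetical model: buggy on the first try, corrected once it sees an error."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def check_add(namespace: dict) -> None:
    """Hypothetical test for the generated function."""
    assert namespace["add"](2, 3) == 5, "add(2, 3) should be 5"


source, attempts = fix_loop(fake_model, check_add)
print(attempts)  # 2
```

The loop only works because the test gives an unambiguous answer. Without an objective check, there is nothing to feed back into the next attempt.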
Fast feedback loop
Many domains have slow feedback cycles.
- Marketing takes weeks
- Strategy takes months
In coding, feedback is immediate.
- Errors show up instantly
- Tests fail quickly
- Output is visible right away
This gives the model constant signals to adjust.
- feedback is fast
- feedback is frequent
- feedback is actionable
Few domains have this property.
Systems, not models
Coding agents are not just LLMs with API access.
They are systems built around the model.
- LLM reasoning
- tool usage
- execution environments
- test runners
- diff-based editing
- retry loops
Most simple agent setups miss this.
The model alone is not reliable enough. The system around it makes it work.
In many cases, the LLM is the least reliable part of the stack.
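One way to picture the system compensating for an unreliable model is a tool dispatcher that validates every request before acting on it. The sketch below is an assumption-heavy illustration: the JSON shape, the `TOOLS` registry and the two tool names are hypothetical and do not reflect the protocol of any real agent product.

```python
import json
from typing import Callable, Dict

# Hypothetical tool registry; real agents expose tools such as file reads and test runs.
TOOLS: Dict[str, Callable[..., str]] = {
    "read_file": lambda path: f"<contents of {path}>",
    "run_tests": lambda: "2 passed, 0 failed",
}


def dispatch(raw: str) -> str:
    """Validate a model-produced tool call and run it, or return an error the model can read."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return f"error: invalid JSON ({exc.msg})"
    if not isinstance(call, dict):
        return "error: tool call must be a JSON object"

    name = call.get("tool")
    if not isinstance(name, str) or name not in TOOLS:
        return f"error: unknown tool {name!r}"

    args = call.get("args", {})
    if not isinstance(args, dict):
        return "error: args must be a JSON object"

    try:
        return TOOLS[name](**args)
    except TypeError as exc:  # wrong or missing arguments
        return f"error: bad arguments ({exc})"


print(dispatch('{"tool": "run_tests"}'))
print(dispatch('{"tool": "delete_repo"}'))
print(dispatch("run the tests please"))
```

Malformed or unexpected output becomes an error message rather than a crash or an unintended action, which gives a retry loop something to work with.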
Natural modularity of the domain
LLMs perform better on smaller, well-defined tasks.
Software is naturally structured this way.
- functions
- classes
- files
- services
Agents can operate locally.
- modify one function
- update one file
- fix one test
They do not need full understanding of the entire system to be useful.
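As a rough illustration of a local edit, the sketch below swaps out a single top-level function and leaves the rest of the file untouched. It uses Python's standard `ast` module to find the function's line range. The helper `replace_function` and the sample source are hypothetical, and the sketch assumes undecorated top-level functions on Python 3.8 or later.

```python
import ast


def replace_function(source: str, name: str, new_def: str) -> str:
    """Replace one top-level function in `source` with `new_def`, leaving other lines as-is."""
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name == name:
            lines = source.splitlines()
            # lineno is 1-based; end_lineno is the last line of the function body.
            start, end = node.lineno - 1, node.end_lineno
            lines[start:end] = new_def.rstrip("\n").splitlines()
            return "\n".join(lines) + "\n"
    raise KeyError(f"no top-level function named {name!r}")


original = (
    "def area(w, h):\n"
    "    return w + h\n"  # bug: should multiply
    "\n"
    "def perimeter(w, h):\n"
    "    return 2 * (w + h)\n"
)
fixed_area = "def area(w, h):\n    return w * h\n"

patched = replace_function(original, "area", fixed_area)
print(patched)
```

Only `area` changes. The edit needs no knowledge of what `perimeter` does, or of anything else in the file.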
High tolerance for imperfection
In many domains, mistakes are costly.
- legal errors have consequences
- financial mistakes are expensive
- medical mistakes are dangerous
Coding is more forgiving.
- code fails fast
- errors are visible
- tests catch issues
- changes can be reviewed
Imperfect output is still useful.
- partially correct code can be fixed
- progress can be incremental
The system allows iteration instead of requiring perfection.
Built on existing infrastructure
Coding agents benefit from existing tooling.
- version control
- CI/CD pipelines
- automated tests
- linters
These systems already provide validation and structure.
Agents can plug into them directly.
Other domains do not have this level of built-in feedback and automation.
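A minimal sketch of what plugging in can look like: the agent runs the same command-line checks a developer or CI job would run and reads the exit codes. The `run_checks` helper and the `CheckResult` type are hypothetical. The only real tool used here is Python's standard `py_compile` module, standing in for whatever linters and test runners a project already has.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, List


@dataclass
class CheckResult:
    name: str
    passed: bool
    output: str


def run_checks(checks: Dict[str, List[str]]) -> List[CheckResult]:
    """Run each command and record pass/fail from its exit code."""
    results = []
    for name, command in checks.items():
        proc = subprocess.run(command, capture_output=True, text=True, timeout=60)
        output = (proc.stdout + proc.stderr).strip()
        results.append(CheckResult(name, proc.returncode == 0, output))
    return results


def demo() -> List[CheckResult]:
    """Compile one valid and one invalid file, as a stand-in for a project's real checks."""
    with tempfile.TemporaryDirectory() as tmp:
        good = Path(tmp) / "good.py"
        bad = Path(tmp) / "bad.py"
        good.write_text("def add(a, b):\n    return a + b\n")
        bad.write_text("def add(a, b)\n    return a + b\n")  # missing colon
        return run_checks({
            "compile good.py": [sys.executable, "-m", "py_compile", str(good)],
            "compile bad.py": [sys.executable, "-m", "py_compile", str(bad)],
        })


results = demo()
for result in results:
    print(result.name, "PASS" if result.passed else "FAIL")
```

The agent needs no new validation logic. The exit code and captured output of existing tools are the signal.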
Text as the interface
LLMs work in text.
Code fits naturally into that.
- structured
- predictable
- machine readable
There is no translation gap.
- model writes code
- system executes code
This direct mapping is rare in other domains.
Supporting factors
A few things helped accelerate adoption.
- developers are early adopters and comfortable with new tools
- cost was often subsidised or bundled
These helped adoption, but they are not the main reason for success.
Closing
Coding has a combination of properties that most other domains lack:
- outputs are verifiable
- feedback loops are fast
- tooling already exists
- tasks are modular
- imperfection is acceptable
That is why coding saw real adoption first.
If we want to understand where agents will work next, the question is simple:
Which other domains look like this?