Coding Agents
What a time to be alive! A few years ago I was highly sceptical of all the superintelligence scaremongering. When the first coding agents came out I found them helpful for writing docstrings and some boilerplate, but otherwise they were mostly crap. I certainly didn’t think that by 2026 agents would be writing much of the code I produce! I’m still highly sceptical of superintelligence scaremongering - the loudest voices are the ones who stand to gain the most - but I am no longer sceptical about AI’s potential for significant disruption. At least, that is, until we run out of energy to power data centres, but that’s a topic for another day.
I’ve been using coding agents (mostly Claude Code) every day for the last few months and my goodness are they helpful! Most of the marketing around coding agents focuses on increased developer output, and there has certainly been plenty of dick swinging on social media about how many daily pull requests one can create. I’m not sure I’m producing code much faster with coding agents, as I spend a lot more time thinking about how to articulate what I’m trying to achieve and poring over the agent’s plan with a fine-tooth comb, making sure it’s perfect before letting it execute. What has changed for me is the cognitive cost of writing code.
Because I have Tourette’s syndrome, I spend much of my time in front of the keyboard ticking rather than typing. Although I can type quite quickly, spending half of my time not typing dramatically reduces my output velocity, so I work twice as hard to compensate. Nowadays I spend much less time typing, so coding agents have been a great leveller in this regard. Having ADHD causes all kinds of problems when it comes to knowledge work. I’m either not focused enough and struggle to get things finished, or I am hyperfocused, which is exhausting and to the detriment of everything else around me. You could say my context window requires far more engineering than a neuro-normie’s! Here too, coding agents have been a great leveller because they redistribute my cognitive load. I don’t need to sustain the same intensity of focus to deliver at least as fast as before, and I can embrace my tendency to get distracted by other tasks whilst the agent is plugging away. This is the actual productivity gain for me, and it is one that throughput metrics don’t capture.
Once the plan is nailed down I find agents tend to implement things pretty well. Some patterns I’ve found effective: red/green TDD, especially if I have the agent wait for me to review the tests before proceeding to implementation; having a different agent review the implementation (e.g. asking Codex to review Claude’s work); and providing a few simple shell scripts the agent can run to perform common actions. For example, I have a get-jira-ticket script, which fetches the description of a given ticket ID using Atlassian’s API. Even if the agent could do this itself via the Atlassian MCP server, wrapping common actions in one-liners is far more context-efficient and less error-prone. The only MCP server I do use regularly is Kindly Web Search paired with a SearXNG instance running locally, which I find to be better than the built-in web search and content fetching tools.
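To give a flavour of what such a helper can look like, here is a minimal sketch written in Python rather than shell. It is not my actual script. The environment variable names (JIRA_BASE_URL, JIRA_EMAIL, JIRA_API_TOKEN) and the function names are made up for illustration, and it assumes a Jira Cloud site using basic auth with an API token and the v2 issue endpoint, which as far as I know returns the description as plain text.

```python
import base64
import json
import os
import sys
import urllib.request


def build_request(ticket_id, base_url, email, api_token):
    """Build an authenticated GET request for a single Jira issue.

    Assumes Jira Cloud's /rest/api/2/issue/{key} endpoint and basic auth
    with an API token; adjust for other setups.
    """
    url = f"{base_url.rstrip('/')}/rest/api/2/issue/{ticket_id}?fields=summary,description"
    credentials = base64.b64encode(f"{email}:{api_token}".encode()).decode()
    headers = {"Authorization": f"Basic {credentials}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers)


def format_ticket(payload):
    """Reduce the API response to the few lines the agent actually needs."""
    fields = payload.get("fields", {})
    summary = fields.get("summary") or ""
    description = fields.get("description") or "(no description)"
    return f"{payload.get('key', '')}: {summary}\n\n{description}"


def main(argv):
    """Fetch and print one ticket; expects argv like ['get-jira-ticket', 'ABC-123']."""
    if len(argv) != 2:
        print("usage: get-jira-ticket TICKET-ID", file=sys.stderr)
        return 2
    # Hypothetical environment variable names, chosen for this sketch only.
    request = build_request(
        argv[1],
        os.environ["JIRA_BASE_URL"],
        os.environ["JIRA_EMAIL"],
        os.environ["JIRA_API_TOKEN"],
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    print(format_ticket(payload))
    return 0
```

To use it as a command you would call sys.exit(main(sys.argv)) at the bottom of the file. The point is that the agent only ever sees a short summary and description, not the full API response or an MCP tool schema, which is where the context saving comes from.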
The one area I’ve found agents struggle with is data-heavy work. I was recently working on a statistical framework for comparing agentic systems, involving lots of data frame manipulations, bootstrapping, and statistical tests. Both Claude and Codex made a complete pig’s ear of it! They failed to use pandas properly, preferring loops and standard Python control statements over grouped operations and data frame masks. My sense is this was due to the lack of an interactive Python environment in which the agent could inspect the data and gradually build up chains of data frame operations, stopping to inspect the output of each step (along the lines of the harness used in Recursive Language Models). At least, this is how I approach these tasks. I assume it would also be an effective approach for an LLM, but it is quite different from the TDD-style approach taken in general software development.
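To illustrate the kind of pandas I was hoping for, here is a small sketch. The data, column names and the bootstrap_mean_ci helper are all invented for this example and are not taken from the actual framework. It filters with a boolean mask, aggregates with a grouped operation, and bootstraps a confidence interval for the mean paired difference between two systems without a single explicit Python loop.

```python
import numpy as np
import pandas as pd

# Made-up per-task scores for two hypothetical agentic systems.
df = pd.DataFrame({
    "system": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "task": ["t1", "t2", "t3", "t4", "t1", "t2", "t3", "t4"],
    "score": [0.9, 0.4, 0.7, 0.8, 0.6, 0.5, 0.3, 0.7],
    "timed_out": [False, False, True, False, False, False, False, True],
})

# A boolean mask replaces a loop with an if statement inside it.
completed = df[~df["timed_out"]]

# A grouped aggregation replaces a dict of per-system accumulators.
summary = completed.groupby("system")["score"].agg(["mean", "count"])

# Pivot to one row per task so the two systems can be compared pairwise.
wide = df.pivot(index="task", columns="system", values="score")
diffs = (wide["a"] - wide["b"]).to_numpy()


def bootstrap_mean_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = np.random.default_rng(seed)
    # Draw all resamples at once as a (n_resamples, n) matrix of indices.
    idx = rng.integers(0, len(values), size=(n_resamples, len(values)))
    means = values[idx].mean(axis=1)
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


lo, hi = bootstrap_mean_ci(diffs)
```

Each intermediate (completed, summary, wide, diffs) is something I would normally print and eyeball in a notebook or REPL before moving on to the next step, which is exactly the feedback loop the agents lacked.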
Overall I’m thoroughly enjoying using coding agents, as it turns out I like building things more than I like coding specifically. I do hope the Jevons paradox holds, though, otherwise I might be out on the street this time next year!