Analog Moment

by James Cox-Morton (@th3james)

No Speed Without Control: A disciplined approach to LLM-augmented software development

Since late 2022 I've been experimenting with using Large Language Models (LLMs) to augment my programming. Getting the best out of them takes skill and discipline, which I suspect is why many software developers are dismissive or uninterested. I think this is mistaken - once I learnt to play to their strengths, I found LLMs let me move faster without sacrificing quality.

Initial disappointments

GitHub Copilot was the first LLM I used. I remember the first time I wrote a function signature and watched the exact code I intended to write materialise. It was mind-blowing, but as someone who is paid to write software, it also caused a degree of trepidation.

That fear was short-lived because I quickly became disillusioned with GitHub Copilot. The initial thrill of the product comes from how much friction it removes and how quickly you can generate code. But I soon found myself debugging issues caused by divergence between what I had assumed Copilot had written and reality. Then there is the ever-present threat of hallucinations - sometimes it just completely makes things up.

But most problematically, Copilot's effortless generation makes it extremely tempting to "author" code you don't fully understand. I found this made me less rigorous about my coding and reduced my sense of accomplishment. I eventually stopped using Copilot, fearing that I was evolving systems faster than I was understanding them, and anticipating that my short-term gains would lead to long-term pains.

Despite this, I could see that AI had the potential to be a force multiplier for software development. I just needed to inject the right friction back into the process to retain rigour, control and ownership. I have found the chat interface pioneered by ChatGPT encourages a much healthier path to AI-augmented programming.

Here are the techniques I find make me a faster and more knowledgeable developer, without compromising control and mastery of the craft. But before that, we need to talk about ethics.

Responsible use

Be mindful of whether you have permission to share the code you're working on with the LLM you're using. Familiarise yourself with your preferred LLM's policies around handling of conversation history.

Your employer may have justifiably strict rules about sharing proprietary code - understand them.

Know whether the project you're contributing to or the company you work for is comfortable accepting AI-generated code. Copyright law is still playing catch-up on who owns AI-generated content.

Some advice in this post should be acceptable regardless of circumstance, but the onus is on you to act responsibly.

Now, on to the techniques...

Do not copy & paste LLM output - type it out yourself

This is a simple rule that makes a big difference. I find reading and then typing out LLM responses keeps me "in-the-loop" and engaging with the suggestions, rather than blindly accepting them. This extra friction gives me time to acknowledge and research parts of the code I don't understand, and eliminates the risk I'm pasting code which is different from what I assume.

Often, the generated code is only 95% of what you want. If you're copying and pasting, the temptation is to live with that and "move fast", but if you're typing it out anyway, you may as well write the code you actually want.

Copy & paste as much as you can into your prompts

The flip side of not copying output is that I find the more context I give LLMs, the better the results. This means you want to make it easy to import relevant code into your conversations, as removing friction here will make you more likely to add useful context.

That said, re-read the "Responsible Use" section and ask yourself if you have permission to do this.

GitHub Copilot for VS Code makes importing context into a conversation easy, but this ease comes at the cost of being explicit about what you're sharing, and I don't like VS Code or the vendor-locked ecosystem Microsoft are trying to establish.

A portable approach is to add files to your prompts in this format:

path/to/file.js 
```
...file contents...
```

Giving the LLM the whole of the file and its path allows it to understand the relationship between files you post.

This Vim mapping adds the current file to your paste buffer in this format by pressing <Space>ly ([L]lm [Y]ank):

" Copy the current file's path and fenced contents to the system clipboard
nnoremap <Space>ly :let @+ = expand('%') . "\n```\n" . join(getline(1,'$'), "\n") . "\n```"<CR>

I also have a simple script which prints a given file in this format. On macOS I can then pipe this to my clipboard using pbcopy (you'll need something like xclip on Linux).
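A minimal sketch of such a script looks something like this (the name in the usage comment is just a placeholder):

#!/bin/sh
# Print a file's path followed by its fenced contents,
# ready to paste into an LLM prompt.
# Usage (placeholder name): llm-yank path/to/file.js | pbcopy
printf '%s\n```\n' "$1"
cat "$1"
printf '```\n'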

Use the best model you can

LLMs are not created equal and, although it's hard to quantify, in my experience there is a tangible difference in the quality of responses between models. I recommend experimenting with different models, and paying for access to better ones. At the time of writing I prefer Anthropic's Claude 3.5 Sonnet but switch to OpenAI's o1-preview for more complex tasks. However, this is a fast-moving space and these recommendations are likely to become outdated quickly.

Before choosing a model ensure you understand their usage policy and handling of conversation data.

A problem well-stated is a problem half-solved

I highly recommend reading OpenAI's Prompt Engineering Guidelines, but the TL;DR is: be explicit about what you want.

You can get pretty decent results by blindly throwing half-formed questions at an LLM, but I find the magic happens when you ask detailed questions which include a clear definition of the problem you're trying to solve and the type of solution you want.
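For example, rather than pasting a function and asking "why is this slow?", a well-stated prompt looks something like this (an illustrative sketch):

The Python function below deduplicates a list of around a million strings while preserving order. It currently takes ~30 seconds; I'd like it under a second without adding third-party dependencies. What is making it slow, and how would you rewrite it?

<the function>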

Write your prompts in a text editor

Unfortunately, the prompt text boxes in LLM interfaces are usually too small for writing the longer, more detailed prompts which generate the best results. Additionally, there's nothing more galling than writing a long prompt, then accidentally refreshing your browser and losing it.

As such, for long queries I find it better to draft them in a text editor then paste them in.

An interactive rubber-duck

Rubber Duck Debugging is the process of describing the programming issue you're facing to an inanimate object (such as a rubber duck) in order to force yourself to state all your assumptions in natural language. This process is remarkably effective at surfacing the underlying issue.

Instead of talking to an inanimate duck, write a prompt to your preferred LLM. You'll get all the benefits of rubber ducking, but also get a free shot at getting the answer from the machine.
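For example, a rubber-duck prompt might look like:

I expect this endpoint to return a 200, because the route is registered and my test client is authenticated, but I get a 404. Here is the route definition and the failing test - what assumption am I missing?

<the relevant code>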

Ask stupid questions

LLMs offer a safe space to ask the most esoteric or asinine questions you'd be too embarrassed to ask an actual human, and get answers which are sometimes surprisingly thoughtful. Again, the more detail you add, the better the answer - don't worry, you're not wasting anyone's time but your own.

Code reviews are free & instantaneous now

Before proceeding, I will once again refer you back to the "Responsible Use" section.

Assuming you have permission to share your code, LLMs make it trivial to get a quick review of a change set by including a diff in your prompt. Simply pipe a git diff into your paste buffer:

# Assuming `main` is your default branch and you're on a Mac
git diff main.. | pbcopy

Then use a prompt in the form of:

Please review my changes for correctness and clarity:

<optional change description>

<the git diff>

This is such a cheap way to get feedback that I find it a total no-brainer. Working as a solo developer, this means actually getting reviews, but if you work on a team this is a nice way to catch silly mistakes and save other reviewers time.

Seek different perspectives

The above is a simple template for generic feedback, but I find reviews more engaging when I ask the LLM to add a bit of character. Here are some fun prompt addenda to try:
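Please review my changes in the voice of a grumpy senior engineer who believes everything could be simpler.

Please review my changes as a security engineer hunting for vulnerabilities.

Please review my changes as a pedantic reviewer obsessed with naming and consistency.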

Finding balance

With these techniques, I feel I have found balance. It's still me authoring the code; I still feel ownership of the results. But I now have someone with limitless patience to "riff" on ideas with, catch mistakes, and take some of the drudgery out. I hope these ideas help you find your own balance, and let you ship better software.