| HN Mirror

I'll see if I can run the experiment again with Codex, if not on the exact same project then a similar one. The advice I'm getting in the other comments is that Codex is more state of the art.

As a quick check I asked Codex to look over the existing source code, generated via Copilot using the GPT-5 agent. I asked it to consider ways of refactoring, and then to implement them. Obviously a fairer test would be to start from scratch, but that would require more effort on my part.

The refactor didn't break anything, which is actually pretty impressive, and there are some improvements. However if a human suggested this refactor I'd have a lot of notes. There's functions that are badly named or placed, a number of odd decisions, and it increases the code size by 40%. It certainly falls far short of what I'd consider a capable coder should be doing.