| I've found GPT-5-Codex (the default model in OpenAI Codex CLI) to be superior but, as others have stated, slower. Caveat: it requires Linux, macOS, or WSL. In general, I find that it writes smarter code, performs smarter refactors, and introduces less chaos into my codebase. I'm not talking about toy codebases. I use agents on large codebases with dozens of interconnected tools and projects. Claude can be a bit of a nightmare there because it's quite myopic. People rave about it, but I think that's because they're effectively vibe-coding much smaller, tightly scoped things like tools and small websites. On a larger project, you need a model that takes the care to see what patterns your code already uses, whether something has already been done, etc. Claude tends to be very fast but generates redundant or comical code (let's try this function 7 different ways so that one of them will pass). This is junior coder bullshit. GPT-5-Codex isn't perfect, but there's far, far less of that. It takes maybe 5x longer but generates something I have more confidence in. I also see Codex using tools in smarter ways. If it's refactoring, it'll often use tools to copy code rather than rewriting it. Rewriting code is how so many bugs have been introduced by LLMs. I've not played with Sonnet 4.5 yet, so it may have improved things! |