by vunderba 89 days ago
Nice. You have a similar setup to mine. I've found OpenAI's gpt-5.3-codex (at xhigh) to be pretty dismal at actual implementation, but it's surprisingly adept at performing a high-level feature branch review and catching issues that Gemini/Claude miss.

Yeah, through trial and error I found that Codex (now gpt-5.4) has really good feedback once there's a plan file.

Kinda like someone on the internet who suddenly becomes an expert once you make concrete claims for them to shred.

I have an AGENTS.md prompt that specifies how to review a plan. Something like: for every finding, rank the top solutions internally and then recommend a solution. And if there's a simpler, better pivot that the plan should take, pitch it at the end.

It's really actionable: I can basically paste the review back into Claude, tell Codex to re-review the plan, and repeat until it has nothing more to add. At that point I have a really good plan.

Nice. I have a similar setup (Opus 4.6 plan <-> Codex xhigh review) that I run as an iterative feedback loop until the meaningful action items converge to zero.

I initially gave some thought to automating this because CC/Codex do have SDKs... but every once in a while Codex will propose some absurdly over-engineered stack advice that I have to manually reject before passing it back to Claude.
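
For anyone curious what a semi-automated version of this loop might look like, here's a rough Python sketch. Everything in it is hypothetical: `review`, `revise`, and `approve` are placeholder callables you'd have to wire up yourself (say, to Codex, Claude, and a human prompt), and none of it reflects the actual CC/Codex SDK APIs. The point is just the shape: review, let a human veto findings, revise, and stop when nothing accepted is left.

```python
from typing import Callable, List


def refine_plan(
    plan: str,
    review: Callable[[str], List[str]],        # hypothetical: reviewer model returns findings
    revise: Callable[[str, List[str]], str],   # hypothetical: planner model applies findings
    approve: Callable[[str], bool],            # hypothetical: human gate, False rejects a finding
    max_rounds: int = 10,                      # assumed safety cap so the loop always ends
) -> str:
    """Iterate plan <-> review until no accepted findings remain."""
    for _ in range(max_rounds):
        findings = review(plan)
        # Manual veto step: drop over-engineered advice before it reaches the planner.
        accepted = [f for f in findings if approve(f)]
        if not accepted:
            break  # converged: nothing actionable left
        plan = revise(plan, accepted)
    return plan
```

Keeping `approve` as a separate step is the whole trick: the loop runs unattended for noncritical plans if you pass `lambda f: True`, and stays human-in-the-loop when you swap in something that asks you about each finding.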

Man, we are on the exact same page.

Yeah, I also thought about automating it before. Especially when I'm middlemanning noncritical plans where I don't care what the solution is.

I did make a skill, /ask-codex, that gets Claude to call Codex for one round of feedback, which is pretty useful.

But for critical plans, I suppose being the middleman gives me a good opportunity to learn about the solution and its trade-offs as it's being incrementally refined. Especially in a domain I don't know much about.