by firesteelrain 36 days ago
How much better is it than Claude? I have both but Claude sucks up so many tokens.
8 comments

5.5 is absolutely comparable to Opus 4.7 (both on highest effort), maybe even better. It generally seems less lazy, faster, and writes code closer to what I'd write. The only downside is that on very long tasks it can lose track of the goal. For tasks under ten minutes I'll go with Codex every time.
The main difference is in the frontend skills. GPT produces terrible design. What I do these days is ask Opus to produce an HTML mockup, then feed it to Codex.
I have not had problems with long goals. I let it chomp for 40 minutes on a proof in my custom theorem prover (xhigh fast), and it got there. Very happy with Codex, I ditched Claude for it.
They've added a new goal mode that might help with that.
I switched some time after Anthropic bricked their models with adaptive thinking. It's a legit mystery to me how people are still using CC professionally.

Codex is far less frustrating and manages context better. It's also costing me about 1/3rd as much as Opus 4.7 on CC.

The only way for me to keep using CC has been to stick to 4.6 1M.
Oh I didn't know you could type /model claude-opus-4-6 and still use it.

Thanks!

Yes, and /model claude-opus-4-6[1m] gets you the larger context window. Happy to help :)
Thanks for the hint, but is a large context window actually that useful? I tend to get garbage too often with a normal big context window.
IME, based on an in-house bench, it's still good up to about 20% of the 1M window for 4.6 and 4.7 with a codebase >50k LOC. The trick I used before switching providers was to have it write a handoff when it hit ~18% of context and reset.
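
A minimal sketch of that handoff rule, assuming you can read the current token usage from your harness somehow (that part is not shown). The function and constant names are hypothetical; only the 1M window and the ~18% figure come from the comment above.

```python
# Hypothetical helper names; the 1M window and ~18% threshold are taken
# from the comment above, everything else is an illustrative assumption.
CONTEXT_WINDOW_TOKENS = 1_000_000
HANDOFF_PERCENT = 18


def handoff_threshold(window_tokens: int = CONTEXT_WINDOW_TOKENS,
                      percent: int = HANDOFF_PERCENT) -> int:
    """Token count at which to write a handoff note and reset the session."""
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return window_tokens * percent // 100


def should_write_handoff(tokens_used: int,
                         window_tokens: int = CONTEXT_WINDOW_TOKENS,
                         percent: int = HANDOFF_PERCENT) -> bool:
    """True once usage reaches the handoff threshold."""
    return tokens_used >= handoff_threshold(window_tokens, percent)
```

With the defaults, the threshold works out to 180,000 tokens, so a session at 150,000 tokens keeps going and one at 180,000 writes its handoff.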

There are also many people running 4.5 with specific parameters who claim to be having luck.

The best way is to code in a way that doesn't break when the nuances of model instructions change. 4.7 is superior.
I stopped trying to use Claude to do anything with 4.7 because it sucks up so many tokens so quickly. I still use the 4.6 model and have switched to Codex for larger tasks. Codex also works better than Claude at more complex coding tasks for web apps that have Python backends and TypeScript front ends.
I've been on the Codex train for a few months now for personal stuff, but have Claude at work. I always tell people it's as good as CC, if not better, but it has different strengths and weaknesses.

Claude was more autonomous and still is a little, but I think GPT 5.5 closed that gap a lot. Claude is far better at front end design. I think it's still better at big picture planning.

Codex is far better at code review and catching bugs that actually matter. I think it's better at following directions, although I think that regressed a bit with 5.5 (flip side of the autonomy I mentioned earlier). A lot of CC users claim to not like Codex's personality (or lack of), but personally I prefer it.

Less gibliterrating and more doing

Very fast

I just like that OpenAI lets you use your Codex subscription with whatever harness you like. I prefer Pi, so that's what I use. GPT 5.5 xhigh feels equivalent to Opus to me, so there's no reason for me to be locked into the Claude Code CLI. I use it off-and-on throughout my workday and never even come close to the pro limits.
Compaction is basically seamless, which is a major weak point of Claude. At effort=low, Claude is better than Codex but still slower. If you don't mind trading some upfront quality for more micromanaging in exchange for faster speed, it is fine. I also think that, for that very reason, you absorb more of the code.
I found it actually thinks about architecture and tests rather than just spitting out code with TODOs in it like Claude.