Hacker News
by evanmoran 85 days ago
I’m writing a new type of CRDT that supports move/reorder/remove ops within a tree structure without tombstones. Claude Code is great at writing some of the code but it keeps adding tombstones back to my remove ops because “research requires tombstones for correctness”.

This is true for the usual approach, but the whole reason I’m writing the CRDT is to avoid these tombstones! Anyway, long story short, I did eventually convince Claude I was right, but to do it I basically had to write a structural proof showing clear ordering and forward progress in all cases. And even then, context compaction tends to reset it. There are a lot of subtleties these systems don’t quite have yet.
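
For context, here is a toy Python sketch of the conventional argument for tombstones. It is not the tombstone-free design described above, and `TreeReplica` and `converge` are hypothetical names made up for this illustration. It assumes a naive replica where a move on an unknown node simply (re)creates it, and shows how a concurrent remove and move can then leave two replicas in different states unless the removal is remembered.

```python
from dataclasses import dataclass, field


@dataclass
class TreeReplica:
    """Toy replicated tree mapping node id -> parent id (illustration only)."""
    use_tombstones: bool
    parent: dict[str, str] = field(default_factory=dict)
    tombstones: set[str] = field(default_factory=set)

    def apply_insert(self, node: str, parent: str) -> None:
        if node in self.tombstones:
            return  # node is known to be deleted
        self.parent[node] = parent

    def apply_move(self, node: str, new_parent: str) -> None:
        if node in self.tombstones:
            return  # drop a move that arrives after the removal
        # Without a tombstone, a removed node is indistinguishable from one
        # never seen, so this naive replica resurrects it.
        self.parent[node] = new_parent

    def apply_remove(self, node: str) -> None:
        self.parent.pop(node, None)
        if self.use_tombstones:
            self.tombstones.add(node)


def converge(use_tombstones: bool) -> tuple[dict[str, str], dict[str, str]]:
    """Run one concurrent remove/move and return both replicas' final trees."""
    a = TreeReplica(use_tombstones)
    b = TreeReplica(use_tombstones)
    for replica in (a, b):
        replica.apply_insert("p", "root")
        replica.apply_insert("n", "root")

    # Concurrent local operations: A removes n while B moves n under p.
    a.apply_remove("n")
    b.apply_move("n", "p")

    # Each replica then receives the other's operation.
    a.apply_move("n", "p")
    b.apply_remove("n")
    return a.parent, b.parent


if __name__ == "__main__":
    print("no tombstones:", converge(use_tombstones=False))
    print("tombstones:   ", converge(use_tombstones=True))
```

In this toy model the replicas only agree when the removal is remembered; a tombstone-free design has to get the same guarantee some other way, for example from ordering information carried by the operations themselves.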

2 comments

Interesting. I'm the author of DocNode, a library that does exactly what you're describing; it might be useful. https://docukit.dev

Cheers!

I would strongly advise using Codex for a project like that.
Please do elaborate. I’ve only tried switching to Codex once or twice, and it’s been probably 3 months since I last tried it, but I was underwhelmed each time. Is it better on novel things in your experience?
My experience is that it is much more terse and realistic with its feedback, and more thoughtful generally. I trust its positive acknowledgements of my work more than those from Claude, whose praise I have been trained to be extremely skeptical of.
In my experience, Codex / ChatGPT are better at telling you where you're wrong, where your assumptions are incomplete, etc., and better at following the system prompts.

But more importantly, as a coding agent, it follows instructions much better. I've frequently had Claude go off and do things I've explicitly told it not to do, or write too much code that did wrong things, and it's more work to corral it than I want to spend.

Codex will follow instructions better. Currently, it writes code that I find a few notches above Claude's, though I'm working with C# and SQL so YMMV; Claude is terrible at coming up with a decent schema. When your instructions do leave some leeway, I find the "judgment" of Codex to be better than Claude's. And one little thing I like a lot is that it can look at adjacent code in your project so it can try to write idiomatically for your project/team. I haven't seen Claude exhibit this behavior, and its code is very middle-of-the-road in terms of style and behavior.

But when I use them, I use them in a very targeted fashion. If I ask them to find and fix a bug, the prompt is going to have as much detail as a full bug report in my own ticketing system, or more. If it's new code, it comes with a very detailed and long spec for what is needed, what is explicitly not needed, the scope, the constraints, what output is expected, etc., like it's a wiki page or epic for another real developer to work from. I don't do vague prompts or "agentic" workflow stuff.

GPT is much better at anything mathematical than Claude, as is Gemini. This is evidenced by their superior results at math Olympiads, the Putnam, etc.
How much is OpenAI paying you for this?
Absolutely nothing. I have active subscriptions for both. Claude is better at FE stuff. Codex is better at actual programming.
How is FE not actual programming? I spend less time on FE than I once did, but it has presented some of the most interesting programming challenges I've encountered in my career. It's a large technical space, rich with 'actual' programming to be done.