by kasey_junk 83 days ago

I think it really depends on how fully formed your AI workflows are. I have a very opinionated set of skills and agents files, plus a harness for running prompts against both for code production.

I run head-to-head comparisons with this setup pretty regularly, and what I’ve found is that there is not much difference in outcomes between the two frontier labs at equivalent model settings. It’s hard to get statistically significant results with my budget and eval ability, but my anecdotal feeling is that outcomes vary as much within each lab’s results as they do between the two labs.
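To make the "in group vs. out of group" comparison concrete, here is a rough sketch of how it could be computed. This is not my actual harness: the `within_and_between` name and the input format (a dict mapping a lab name to a list of per-run scores) are assumptions made up for illustration.

```python
from statistics import mean, pvariance


def within_and_between(scores):
    """Compare spread inside each group to spread between groups.

    `scores` is a hypothetical format: {lab_name: [score_per_run, ...]}.
    Returns (within, between) as population variances.
    """
    # Mean score for each lab across all of its runs.
    group_means = [mean(runs) for runs in scores.values()]
    # Within-group: average variance of runs around their own lab's mean.
    within = mean(pvariance(runs) for runs in scores.values())
    # Between-group: variance of the lab means relative to each other.
    between = pvariance(group_means)
    return within, between
```

If `within` comes out about as large as `between`, run-to-run noise is as big as the gap between labs, which matches my feeling above. With only a handful of runs per lab this is a sanity check, not a significance test.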

Given that setup, I use Codex much more than Claude because it’s more reliable.

But I believe it’s easier to go from nothing to decent with Claude.

For other stuff I use Claude.