Hacker News
by recsv-heredoc 6 hours ago
The harness is really important. It matters a great deal, possibly even more than the model. We had harness crashes after running many agents, though granted we were doing quite a bit with it. Grok Build (as a product) review here:

UX friction is worse than Claude Code's - but that seems to be a strange positioning choice - they're more on the 'vibe' side than the 'agentic engineering' side of things.

The largest issue was actually reviewing output - but if you're going to make that largely opaque to the user, why choose a CLI-based interface that's so mouse-heavy?

There are also problems with the actual model. Thinking is visible, and every interaction goes like this:

"I would like you to investigate adding an API route to tackle x,y,z" *Grok, thinking:* "Okay - the user has asked me to add an API route to tackle x,y,z"

There are other absolutely absurd quirks too - "I have no tools available in my context" being visible in the CoT.

The auto-approval (yellow, auto-mode) review in Claude Code via Opus is a killer feature - every build-it CLI should be offering this for long-horizon tasks.

Messaged one of the engineers about our experience - no feedback.

You'd be better off with Claude Code 5x Max than the 300 USD/month subscription.