by jst1fthsdys 14 days ago
Define "much worse". I use DS v4, GLM, and some Kimi with omp personally, and have Cursor with the latest Claude and GPT models at work. I notice zero difference between Opus and DS in my workflow.

Really confused how people make these claims. Are you just basing this off benchmarks or your own personal work? Are you an experienced dev or just doing vibe coding?

2 comments

My own experience. I'm working on something complex that's not in the datasets these models were trained on. There I see V4 flash breaking down and hallucinating much more often than GPT/Claude. For normal, common tasks, I also don't see much of a difference.
There's huge variation in how people prompt and use their models. Vibe coding with ambiguous requirements and multiple steps of precise planning are completely different things imo.