Hacker News
by Slav_fixflex 99 days ago
As someone who builds with LLMs daily without being a developer, I notice quality differences more in practical output than in benchmarks. Claude handles complex multi-step tasks better in my experience, but consistency is still the biggest challenge: the same prompt can give very different results from day to day.