Hacker News
by _zagj 99 days ago
> I feel like anyone who used AI coding tools before 11/25 and after 1/26 (with frontier models) will say there has been a massive jump in, there is a difference between whether an LLM can do a specific task or pass some arguably arbitrary checks by maintainers vs. what they are capable of.

How much of that is the model and how much of that is the tooling built around it? Also why is the tooling, specifically Claude Code, so buggy?

1 comment

90% model, if not more. Look at the Terminus tool in the terminal benchmark; that mostly proves it.