by TeMPOraL 983 days ago
> there's always a comment that says something along the lines of "ah did you test it on GPT-4"?

Perhaps because whenever there's "a news or article noting the limits of current LLM tech", it's a bit like someone trying to play a modern game on a machine they found in their parents' basement, where the only appropriate response is, "have you tried running it on something other than a potato?" This has been happening so often over the past few months that it's the first red flag you check for.

GPT-4 is still qualitatively ahead of all other LLMs, so outside of articles addressing specialized aspects of different model families, the claims are invalid unless they were tested on GPT-4.

(Half the time the problem is that the author used the ChatGPT web app and did not even realize that there are two models and that they've been using the toy one.)