Hacker News
by araghuvanshi 786 days ago
Well, doesn't the same principle of false advertising about context window sizes also apply to the model's inability to count? AI companies claim that their models can do math, so wouldn't a regular developer assume that they can also count?

And if I can't trust a so-called SOTA model to give even a partial answer (say, recalling each mention of the word "wizard") instead of just giving me the wrong answer, then why should I trust it to list out specific scenes? That's even harder to benchmark.