by porridgeraisin 503 days ago
Yeah, I don't think they are a useful measuring stick for LLMs.

My amateur opinion is that an "AI system" resembling AGI or ASI, or whatever the acronym of the day is, will be modular, with different parts addressing different kinds of learning, rather than entirely end to end. One of the main milestones towards achieving this would be the ability to dynamically work out what is left to be learnt (finding gaps), and then potentially have it train itself to learn that, automatically. A half-milestone, I suppose, would be for humans to be able to find the gaps in its abilities in the first place.

I attended a talk recently where they presented research that tried to effectively distinguish between the following two types of LLM failures:

1) the model is unable to generalize/give the output at the "representation layer" itself

2) the model has the information represented, but is not able to retrieve it for the given reasonable prompt, and requires "context scaling"

Which is a step towards this goal, I suppose.