by ben_w 775 days ago
I've heard claims that context without forgetfulness was already reached 2 months ago, but as I'm not a domain expert I don't trust myself to tell breakthroughs from marketing BS, and I definitely can't tell either of those from a Clever Hans effect: https://arstechnica.com/information-technology/2024/03/claud...

I work in this field, so here's a comment with a higher signal-to-noise ratio than you'll commonly find on HN when it comes to LLMs: notice how the demo use cases for very long context deal almost universally with point retrieval, and never demonstrate a high degree of in-context learning. That is not a coincidence. The ability to retrieve things is already pretty great, and superhuman. The ability to reason about what was retrieved, or to combine it in nontrivial ways, still leaves a lot to be desired - for that you have to train (or at least fine-tune) the underlying model. Which IMO is great, because it neatly plugs the gaps in human capability.