Hacker News
by visarga 1574 days ago
> Isn't a system like GPT-3 currently limited to reflecting the ground truth data it has seen?

This limitation went away recently. A variant called RETRO (Retrieval-Enhanced Transformer) can use what is effectively a search engine over a text database to pull in exact, up-to-date information [1], assuming you can curate your own text corpus. It also has about 25x fewer parameters than GPT-3.

[1] https://deepmind.com/research/publications/2021/improving-la...
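
To make the retrieval idea concrete, here is a toy sketch of looking up passages in a small curated corpus. It is not RETRO's actual method: as I understand the paper, RETRO uses learned embeddings and feeds the retrieved chunks into the model through attention, while this sketch only ranks texts by bag-of-words cosine similarity. The names `tokenize`, `cosine`, `retrieve`, and the three-line `corpus` are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split into alphanumeric word tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, corpus, k=2):
    # Score every passage against the query and return the top k
    # passages that share at least one token with it.
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(doc))), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


# Hypothetical curated corpus; a real one would be far larger.
corpus = [
    "The negroni is a cocktail made of gin, vermouth and Campari.",
    "RETRO conditions a language model on text chunks retrieved from a database.",
    "Sourdough bread needs a starter and a long fermentation.",
]

hits = retrieve("what goes into a negroni cocktail", corpus, k=1)
print(hits)
```

The retrieved passages would then be handed to the model alongside the prompt. Note that a query whose terms appear nowhere in the corpus gets nothing back, so the answer quality is bounded by what the corpus contains.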


That's really cool. But unless I am misunderstanding this, it still puts the burden on the existing web, right? It just avoids having to retrain the model. If there is no economic market for humans to produce new content about a topic, how will the search engine find the "ground truth" content?

You might want to use a limited subset of the web, such as a curated list of sources or feeds. Apparently 1TB of text could be enough; you just need to collect it or download it from a trusted source.

So, suppose there is a new kind of cocktail that is popular in bars near me, and nobody has written about it under its new trendy name.

How do I ask this system about the recipe, or the history of the cocktail? Someone has to write an article about it, right? How do they get paid if it gets scraped once and people go to the scraping model for the answer instead of visiting the original article's page?

Give it two years and we might have passable agents running on phones. Within a year there will probably be a model small and capable enough to run on a desktop with 8 GB of RAM or less.

These first large language models are naive, unoptimized implementations of data structures we're still learning to inspect and optimize. Something like RETRO that runs locally with a "just clever enough" service agent is very close to workable. I can't wait to see what happens in ML over the next two years, and who knows what kind of radical evolution the next big algorithm is going to bring.

Oh, I totally see that. The issue I'm talking about isn't one of compute, but of high-quality ground truth. This machine can already hallucinate all kinds of information in perfect English. The difficulty is that a good search engine needs to return more than just information that matches my query; it should return information that matches the objective reality that people (and currently not the machine) inhabit. The machine needs text input to learn about the world. Is the future going to look like companies hiring people to write essays about the world for machine consumption?

I think it's similar to the problem we see today with ad-supported news being indexed by search engines, but taken to another order of magnitude when a model needs to scan those articles only once to have near-perfect recall of the details.