| HN Mirror

by Spivak 1221 days ago

Look, I know that "user is holding it wrong" is a meme, but this is a case where it's true. The fact that LLMs contain any factual knowledge is a side effect. While it's fun to play with and see what it "knows" (and it can actually be useful as a weird kind of search engine if you keep in mind it will just make stuff up), you don't build an AI search engine by just letting users query the model directly and calling it a day.

You shove the most relevant results from your search index into the model as context and then ask it to answer questions from only the provided context.
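A minimal sketch of that idea, not anyone's production setup: the word-overlap ranking is a toy stand-in for a real search index, and `tokenize`, `top_k`, and `build_prompt` are hypothetical names made up for illustration. It only builds the prompt; sending it to a model is left out.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def top_k(query, documents, k=2):
    """Rank documents by words shared with the query (toy stand-in for a search index)."""
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Put the top-k results into the prompt and tell the model to use only those."""
    results = top_k(query, documents, k)
    context = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(results))
    return (
        "Answer the question using only the numbered context below. "
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Made-up documents, purely for illustration.
docs = [
    "The library opens at 9am on weekdays.",
    "Parking is free after 6pm.",
    "The cafe closes at 4pm.",
]
print(build_prompt("When does the library open?", docs))
```

Numbering the passages is a design choice: it gives the model something to cite, which makes it easier to check the answer against the sources afterwards.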

Can you actually guarantee the model won't make stuff up even with that? Hell no, but you'll do a lot better. And the game now becomes figuring out better context and validating that the response can be traced back to the source material.
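One crude way to do that kind of tracing, as a sketch under stated assumptions: flag every number in the answer that appears in none of the source passages. `unsupported_numbers` is a hypothetical helper and the figures are made up. It only catches invented numbers; it won't notice a real number attached to the wrong claim.

```python
import re

# Matches integers and simple decimals such as 4 or 4.04.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def unsupported_numbers(answer, sources):
    """Return the numbers in the answer that appear in none of the source passages."""
    seen = set()
    for passage in sources:
        seen.update(NUMBER.findall(passage))
    return [n for n in NUMBER.findall(answer) if n not in seen]


# Made-up source passage and model answer, purely for illustration.
sources = ["Revenue was 4.04 billion and operating margin was 4.6 percent."]
answer = "Revenue was 4.04 billion with an operating margin of 5.9 percent."

print(unsupported_numbers(answer, sources))
```

Anything this returns is a candidate for rejecting the answer or showing the user a warning next to it.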

3 comments

stdgy 1221 days ago

The examples in the article seem to be making the point that even when the AI cites the correct context (i.e., financial reports), it still produces completely hallucinated information.

So even if you were to white-list the context to train the engine against, it would still make up information because that's just what LLMs do. They make stuff up to fit certain patterns.

williamcotton 1221 days ago

That’s not correct. You don’t need to take my word for it. Go grab some complete baseball box scores and you can see that ChatGPT will reliably translate them into an entertaining, paragraph-length English outline of the game.

This ability to translate has been shown experimentally to depend on the size of the LLM, but for lower-complexity analytic prompts it can reliably avoid synthesizing information.

joe_the_user 1221 days ago

> You don't build an AI search engine by just letting users query the model directly and call it a day.

Have you ever built an AI search engine? Neither have Google or MS yet. No one knows yet what the final search engine will be like.

However, we have every indication that all of the localization and extra training are fairly "thin" things like prompt engineering and maybe a script filtering things.

And despite ChatGPT's great popularity, the application is a monolithic text prediction machine, so it's hard to see what else could be done.

nl 1221 days ago

Who is this "you" you speak of when you say "you don't build an AI search engine by just letting users query the model directly and call it a day"?

Because Microsoft might not have exactly done that, but it isn't far off it.
