by skydhash 754 days ago
> I sometimes want a search engine, sometimes a question engine.

If you want a search engine, it's easy to use the results as feedback to refine the query. But a question (answer?) engine would need to be an expert in the subject, not just parrot what it has seen. That usually means curation: something has to do the work ahead of time to separate the wheat from the chaff. I don't see how LLMs can do that.

LLMs can't be a search engine, and they can't be a question engine. The best way to treat one is as a simulation engine, but the use cases depend on the training data. And we already have proof that the internet is full of junk and not that expansive.

1 comment

> I don't see how LLMs can do that.

If it's in the training data, then it should be able to do that. That is to say, a comment's points matter, as do the subreddit it's in, who said it, and how the rest of their comments do and where they were posted. The LLM could annotate the unredacted Reddit dataset with metadata rating each comment on the words used, the accuracy of the information, the sarcasm quotient, the hilarity quotient, and how condescending it is. An LLM could generate all of that metadata and feed it back into itself to get better and better.
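
As a rough sketch of that idea in Python: every field name, weight, and the threshold below are made up for illustration, and the accuracy/sarcasm/condescension scores are hand-filled stand-ins for what an LLM annotator would generate. It only shows how such metadata could be combined into one curation score, not how any real pipeline does it.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    text: str
    points: int               # community votes on the comment
    subreddit: str            # where it was posted
    author_avg_points: float  # how the author's other comments tend to do
    # Hypothetical annotator outputs in [0, 1]; an LLM would fill these in.
    accuracy: float
    sarcasm: float
    condescension: float


# Hypothetical per-subreddit trust weights (assumed, not measured).
SUBREDDIT_WEIGHT = {"askscience": 1.0, "funny": 0.3}


def quality_score(c: Comment) -> float:
    """Combine vote, venue, author, and annotation signals into one score."""
    # Clamp vote counts to [0, 100] and scale to [0, 1].
    community = min(max(c.points, 0), 100) / 100
    author = min(max(c.author_avg_points, 0.0), 100.0) / 100
    venue = SUBREDDIT_WEIGHT.get(c.subreddit, 0.5)
    # Sarcastic or condescending comments are discounted.
    tone_penalty = 0.5 * c.sarcasm + 0.5 * c.condescension
    raw = venue * (0.5 * c.accuracy + 0.3 * community + 0.2 * author)
    return raw * (1 - tone_penalty)


def curate(comments, threshold=0.4):
    """Keep only comments whose score clears the (arbitrary) threshold."""
    return [c for c in comments if quality_score(c) >= threshold]
```

With something like this, a highly upvoted but sarcastic joke in a low-trust subreddit can still score below a modestly upvoted, accurate answer, which is the kind of wheat-from-chaff filtering the parent is asking about. Whether the LLM's own annotations are reliable enough to feed back into training is the open question.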