Hacker News
by osmarks 446 days ago
You could just run a local LLM over every document and ask it "is this related to this query", but I don't think you actually want to wait a week for results (and holding all the documents you might ever want to search would run to petabytes).

(the reasonable way is embedding search, which runs much faster with some precomputation, but you still have to store things)

2 comments

A better way would be to ask the LLM to generate keywords (or queries). And then use old school techniques to find a set of documents, and then filter those using another LLM.
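To make that pipeline concrete, here is a rough sketch. Everything in it is an assumption for illustration: `generate_keywords` and `llm_says_relevant` are hypothetical stand-ins for the two LLM calls (here they are just token matching), and the "old school" part is a plain in-memory inverted index.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Classic text index: map each token to the ids of the docs containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def generate_keywords(query):
    """Hypothetical stand-in for 'ask the LLM for keywords'.

    A real version would call a model; this one just reuses the query tokens.
    """
    return tokenize(query)


def candidates(index, keywords):
    """Boolean OR over the keywords: every doc that contains at least one."""
    ids = set()
    for keyword in keywords:
        ids |= index.get(keyword, set())
    return ids


def llm_says_relevant(query, doc):
    """Hypothetical stand-in for the final LLM filter.

    Placeholder rule (an assumption): keep docs sharing two or more tokens.
    """
    return len(set(tokenize(query)) & set(tokenize(doc))) >= 2


def keyword_search(docs, index, query):
    """Keywords -> index lookup -> filter, returning docs in index order."""
    keywords = generate_keywords(query)
    hits = sorted(candidates(index, keywords))
    return [docs[i] for i in hits if llm_says_relevant(query, docs[i])]
```

The point of the shape is that the expensive filter only ever sees the candidate set, never the whole corpus.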
How is that better than embeddings? You’re using embeddings to get a finite list of keywords, throwing out the extra benefits of embeddings (support for every human language, for instance), using a conventional index, and then going back to embedding space for the final LLM?

That whole thing can be simplified to: compute and store embeddings for docs, compute embeddings for query, find most similar docs.
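Roughly, in code. This is only a sketch: `embed` is a toy stand-in I made up (hashed character trigrams) so the example runs without a model, and `DIM` is arbitrary. A real setup would swap in an actual embedding model and probably an approximate nearest-neighbor index instead of a full sort.

```python
import math
import zlib
from collections import Counter

DIM = 256  # assumed toy dimensionality, not from any real model


def embed(text):
    """Toy stand-in for an embedding model: hashed character trigrams, unit length."""
    counts = Counter()
    padded = f"  {text.lower()}  "
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3].encode("utf-8")
        counts[zlib.crc32(trigram) % DIM] += 1
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    vec = [0.0] * DIM
    for bucket, count in counts.items():
        vec[bucket] = count / norm
    return vec


def cosine(a, b):
    """Both vectors are unit length, so the dot product is the cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


def build_index(docs):
    """Precomputation step: compute and store one embedding per document."""
    return [(doc, embed(doc)) for doc in docs]


def search(index, query, k=3):
    """Embed the query, then return the k most similar stored documents."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]
```

Only `build_index` touches every document, and it runs once; each query after that costs one embedding plus a similarity scan.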

Yes, you can do the "old school search" part with embeddings.
Ah, I had interpreted “old school search” to mean classic text indexing and Boolean-style search. I’d argue that if it’s using embeddings and cosine similarity, it’s not old school. But that’s just semantics.
The entire Library of Congress is like 10 TB. You don’t need anything near petabytes until you get out of text into rich media.
Common Crawl is petabytes. Anna's Archive is about a petabyte, but it includes PDFs with images.