| HN Mirror

	by thorum 458 days ago
	In the age of local LLMs I’d like to see a personal recommendation system that doesn’t care about being scalable and efficient. Why can’t I write a prompt that describes exactly what I’m looking for in detail and then let my GPU run for a week until it finds something that matches?

7 comments

osmarks 458 days ago

You could just run a local LLM over every document and ask it "is this related to this query". I don't think you actually want to wait a week (and holding all the documents you might ever want to search would run to petabytes).

(the reasonable way is embedding search, which runs much faster with some precomputation, but you still have to store things)
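
A rough back-of-envelope sketch of the trade-off described above, in Python. Every number here is an assumption chosen for illustration (throughput, document length, vector size); none of them come from the thread. The function names are hypothetical.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60


def docs_scanned_per_week(tokens_per_second: float, tokens_per_doc: int) -> int:
    """Documents one local LLM could read in a week at a steady rate."""
    return int(tokens_per_second * SECONDS_PER_WEEK // tokens_per_doc)


def embedding_store_bytes(num_docs: int, dim: int, bytes_per_value: int = 4) -> int:
    """Bytes needed to keep one dense vector per document (float32 by default)."""
    return num_docs * dim * bytes_per_value


if __name__ == "__main__":
    # Assumed: 1000 tokens/s of prompt processing, 2000 tokens per document.
    print(docs_scanned_per_week(tokens_per_second=1000, tokens_per_doc=2000))
    # Assumed: 100 million documents, 768-dimensional float32 vectors.
    print(embedding_store_bytes(num_docs=100_000_000, dim=768) / 1e9, "GB")
```

Under those assumed figures, a week of brute-force reading covers a few hundred thousand documents, while precomputed vectors for a far larger corpus fit on a single consumer drive. The documents themselves still have to live somewhere.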

amelius 458 days ago

A better way would be to ask the LLM to generate keywords (or queries), then use old-school techniques to find a set of documents, and then filter those using another LLM.
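
A minimal sketch of that three-stage pipeline, assuming a simple inverted index stands in for the "old school" part. The two LLM steps are passed in as plain callables (`generate_keywords` and `is_relevant` are hypothetical names), so no particular model or API is implied.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Set


def build_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Inverted index: lowercased word -> ids of documents containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word.strip(".,;:!?\"'()")].add(doc_id)
    return index


def keyword_search(index: Dict[str, Set[str]], keywords: Iterable[str]) -> Set[str]:
    """Boolean OR over the keywords: any document containing at least one."""
    hits: Set[str] = set()
    for keyword in keywords:
        hits |= index.get(keyword.lower(), set())
    return hits


def search(
    prompt: str,
    docs: Dict[str, str],
    index: Dict[str, Set[str]],
    generate_keywords: Callable[[str], List[str]],  # stage 1: LLM call (stubbed)
    is_relevant: Callable[[str, str], bool],        # stage 3: LLM call (stubbed)
) -> List[str]:
    """Keywords from the prompt, cheap index lookup, then a slower filter."""
    candidates = keyword_search(index, generate_keywords(prompt))
    return sorted(d for d in candidates if is_relevant(prompt, docs[d]))
```

The point of the shape is that the expensive filter only ever sees the candidates the cheap index returns, not the whole corpus.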

brookst 458 days ago

How is that better than embeddings? You’re using embeddings to get a finite list of keywords, throwing out the extra benefits of embeddings (support for every human language, for instance), using a conventional index, and then going back to embedding space for the final LLM?

That whole thing can be simplified to: compute and store embeddings for docs, compute embeddings for query, find most similar docs.
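
The three steps above can be sketched as follows. `toy_embed` is a stand-in (a hashed bag of words) used only so the example runs without a model. It is an assumption of this sketch, not a real embedding, and a real setup would swap in an actual embedding model behind the same `embed` callable.

```python
import math
import zlib
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def toy_embed(text: str, dim: int = 64) -> Vector:
    """Placeholder embedding: counts of words hashed into `dim` buckets."""
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % dim] += 1.0
    return vec


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_store(docs: Dict[str, str], embed: Callable[[str], Vector]) -> Dict[str, Vector]:
    """Step 1: compute and store one embedding per document (done once)."""
    return {doc_id: embed(text) for doc_id, text in docs.items()}


def most_similar(
    query: str,
    store: Dict[str, Vector],
    embed: Callable[[str], Vector],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Steps 2 and 3: embed the query, then rank documents by similarity."""
    q = embed(query)
    scored = [(doc_id, cosine(q, vec)) for doc_id, vec in store.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

This scans every stored vector per query, which is fine for a personal corpus. Larger collections usually need an approximate nearest-neighbour index instead of a linear scan.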

amelius 458 days ago

Yes, you can do the "old school search" part with embeddings.

brookst 458 days ago

Ah, I had interpreted “old school search” to mean classic text indexing and Boolean-style search. I’d argue that if it’s using embeddings and cosine similarity, it’s not old school. But that’s just semantics.

osmarks 458 days ago

https://arxiv.org/abs/2212.10496

kortilla 458 days ago

The entire Library of Congress is like 10TB. You don’t need anything near petabytes until you get out of text into rich media.

osmarks 458 days ago

Common Crawl is petabytes. Anna's Archive is about a petabyte, but it includes PDFs with images.

pizza 458 days ago

It's worth pointing out that even with the largest models out there, coherence drops fast over length. In a local home ML setup, until somebody radically improves long-term coherence, fitting a model in less than x memory may be directly at odds with getting something that still says the right thing after more than y minutes of search.

whiplash451 458 days ago

Why would it take a week?

Is this because you want it to continuously watch for live data that could match your need?

mdp2021 458 days ago

Because thinking takes time.

r4ndomname 458 days ago

This is exactly what I am hoping to get sometimes (but I would say, 1 week is maybe a little long).

If I go through my current tasks and see that some task needs a set of documents, emails, etc., why can't I just prompt the system to gather them in 30-ish minutes? But as someone already stated, Apple Intelligence is supposed to fill this gap.

mdp2021 458 days ago

> maybe a little long

Many of us have ongoing problems pending for years - for just "a week", "where do I sign".

It really depends on the task.

bryanrasmussen 458 days ago

This is sort of like a dream I had: https://medium.com/luminasticity/the-county-map-of-the-world...

>The idea was that he could graft queries in this that he did not expect to finish quickly but which he could let run for hours or days and how freeing it was to do more advanced research this way.

fhe 458 days ago

Or it keeps monitoring the web and notifies me whenever something that matches my interests shows up, like a more sophisticated Google alert. I really would love that.
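
A minimal sketch of the alerting half of that idea, assuming something else fetches the items and something else (an LLM or an embedding comparison) scores them. `new_matches`, `score`, and `threshold` are hypothetical names for illustration.

```python
from typing import Callable, Iterable, List, Set, Tuple

Item = Tuple[str, str]  # (item_id, text)


def new_matches(
    items: Iterable[Item],
    seen: Set[str],
    score: Callable[[str], float],  # assumed relevance scorer in [0, 1]
    threshold: float = 0.5,
) -> List[str]:
    """Return ids of not-yet-seen items whose score reaches the threshold.

    `seen` is updated in place so repeated polls never re-alert on an item.
    """
    matches: List[str] = []
    for item_id, text in items:
        if item_id in seen:
            continue
        seen.add(item_id)
        if score(text) >= threshold:
            matches.append(item_id)
    return matches
```

Calling this on each poll of a feed, with the same `seen` set, gives the "notify me once per new match" behaviour. Scheduling and fetching are left out on purpose.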

desdenova 458 days ago

Why can't you?

Just run the biggest model you can find out of swap and wait a long time for it to finish.

You'll obviously see more focus on smaller models, because most people aren't willing to wait weeks for their slop, and also don't have server GPU clusters to run huge models.

HeatrayEnjoyer 458 days ago

> Just run the biggest model you can find out of swap

This kills the SSD
