| HN Mirror



	by calculito 802 days ago
	I assume the question is really which LLM can cover the most tasks while delivering decent quality. I would prefer an architecture that uses different LLMs for different tasks, more like 'specialists' than simple 'agents'. I usually take the main task, divide it into smaller tasks, and see what I can use to solve each one. Sometimes a rule-based approach is already enough for a sub-task, and an LLM would be not only overkill but also more difficult to implement and maintain.
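A minimal sketch of that idea, assuming a simple dispatch table: each sub-task is routed either to a rule-based handler or to a 'specialist' model. All names here (`extract_date_rule`, `summarize_with_llm`, `run_subtask`) are hypothetical, and the model call is only a placeholder, not a real API.

```python
import re
from typing import Callable, Dict, Optional


def extract_date_rule(text: str) -> Optional[str]:
    """Rule-based sub-task: find the first ISO date (YYYY-MM-DD) with a regex."""
    match = re.search(r"\b\d{4}-\d{2}-\d{2}\b", text)
    return match.group(0) if match else None


def summarize_with_llm(text: str) -> str:
    """Placeholder for a call to a 'specialist' model (hypothetical, no real API)."""
    return "[summary model output for: " + text[:30] + "]"


# Dispatch table: which handler covers which sub-task.
HANDLERS: Dict[str, Callable[[str], object]] = {
    "extract_date": extract_date_rule,   # a regex is enough here
    "summarize": summarize_with_llm,     # this one goes to a model
}


def run_subtask(kind: str, text: str) -> object:
    """Route one sub-task to its handler, failing loudly on unknown kinds."""
    handler = HANDLERS.get(kind)
    if handler is None:
        raise ValueError(f"no handler for sub-task {kind!r}")
    return handler(text)


print(run_subtask("extract_date", "Released on 2024-03-15."))
```

The point of the table is that swapping a rule for a model (or the reverse) changes one entry and nothing else.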

1 comments

rahimrezgui 802 days ago

so what is your answer to the question?

calculito 801 days ago

Depends on what you want to do. Just for testing, most of the 7B models are a good compromise between quality and performance (that is, execution time).

FezzikTheGiant 801 days ago

Is there a way to reliably package these models with existing games and make them run locally? This would virtually make inference free right?

From my limited understanding of this field, if smaller models can run reliably and quickly on consumer hardware, that would be a game changer.

talldayo 800 days ago

> This would virtually make inference free right?

Not really. Inference is never "free" unless you cache the result (which is just a static output) or unless you reduce complexity (which yields procedurally less-usable outputs).
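To illustrate the caching case, here is a rough sketch using Python's standard `functools.lru_cache`. The `run_model` function is a hypothetical stand-in for an expensive local inference call; a repeated prompt is served from the cache, so it costs nothing but also returns the same static output every time.

```python
from functools import lru_cache

# Counts how often the (pretend) model actually runs.
CALLS = {"count": 0}


def run_model(prompt: str) -> str:
    """Stand-in for an expensive local inference call (hypothetical)."""
    CALLS["count"] += 1
    return prompt.upper()


@lru_cache(maxsize=1024)
def cached_generate(prompt: str) -> str:
    """Return the cached output for a prompt; run the model only on a miss."""
    return run_model(prompt)


cached_generate("hello")
cached_generate("hello")  # served from the cache, the model is not run again
print(CALLS["count"])
```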

FezzikTheGiant 800 days ago

Can you explain further? Why would it not be free if it's running locally?

talldayo 800 days ago

I thought you meant "free" in terms of computational cost; running locally is technically free of charge, but it also requires a lot of processing power. Running inference on a properly sized LLM can starve the rest of your software of GPU/CPU/memory, so you have to plan accordingly.
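As a back-of-the-envelope way to plan, the memory needed just to hold a model's weights is roughly parameter count times bytes per parameter. The helper below is a hypothetical sketch; it ignores activations, the KV cache, and runtime overhead, so real usage will be higher.

```python
def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


# A 7B-parameter model at 16-bit and at 4-bit precision (weights only).
print(round(weight_memory_gib(7e9, 16), 2))  # about 13.04
print(round(weight_memory_gib(7e9, 4), 2))   # about 3.26
```

On a machine where a game already wants most of the GPU memory, even the 4-bit figure is a meaningful share of the budget.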