| HN Mirror



	by rahimrezgui 802 days ago
	so what is your answer to the question?

1 comments

calculito 801 days ago

Depends on what you want to do. Just for testing, most of the 7B models are a good compromise between quality and performance (that is, execution time).


FezzikTheGiant 801 days ago

Is there a way to reliably package these models with existing games and make them run locally? This would virtually make inference free right?

From my limited understanding of this field, I think that if smaller models can run reliably and quickly on consumer hardware, that would be a game changer.


talldayo 800 days ago

> This would virtually make inference free right?

Not really. Inference is never "free" unless you cache the result (which is just a static output) or unless you reduce complexity (which yields procedurally less-usable outputs).


FezzikTheGiant 800 days ago

Can you explain further? Why would it not be free if it's running locally


talldayo 800 days ago

I thought you meant "free" in terms of computational cost. Running locally is technically free of charge, but it also requires a lot of processing power. Running inference on a properly sized LLM can starve the rest of your software of GPU, CPU, and memory access, so you have to plan accordingly.
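As a rough way to "plan accordingly", the memory needed for the weights alone can be estimated as parameter count times bits per weight. The sketch below assumes a 7B-parameter model and ignores the KV cache, activations, and runtime overhead, so real usage will be higher. The function names and the budget numbers are made up for illustration.

```python
def estimate_weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights only, in GiB."""
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


def fits_in_budget(n_params: float, bits_per_weight: float,
                   budget_gib: float) -> bool:
    """True if the weights alone fit in the memory left over for the model."""
    return estimate_weight_memory_gib(n_params, bits_per_weight) <= budget_gib


# Assumed example: a 7B model next to a game that leaves 4 GiB free.
fp16_gib = estimate_weight_memory_gib(7e9, 16)  # roughly 13 GiB
q4_gib = estimate_weight_memory_gib(7e9, 4)     # roughly 3.3 GiB

fp16_fits = fits_in_budget(7e9, 16, budget_gib=4.0)
q4_fits = fits_in_budget(7e9, 4, budget_gib=4.0)
```

Under these assumptions, only the 4-bit version fits in the leftover budget, and that is before counting anything beyond the weights.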
