| HN Mirror


	by sunpazed 353 days ago
	Don’t have enough RAM for this model; however, the smaller 20B model runs nice and fast on my MacBook and is reasonably good for my use cases. Pity that function calling is still broken with llama.cpp.

3 comments

tarruda 353 days ago

It is fixed in this PR/branch: https://github.com/ggml-org/llama.cpp/pull/15181

codazoda 352 days ago

I'm glad to see this was a bug of some sort and (hopefully) not a hard RAM limitation. I've used quite a few of these models on my MacBook Air with 16GB of RAM. I also have a plan to build an AI chat bot and host it from my bedroom on a $149 mini PC. I'll probably go much smaller than the 20B models for that. The Qwen3 4B model looks quite good.

https://joeldare.com/my_plan_to_build_an_ai_chat_bot_in_my_b...

tempotemporary 350 days ago

What are your use cases? Wondering if it's good enough for coding / agentic stuff.
