


	by BoredomIsFun 105 days ago
	> Local inference for chats sucks.

	/r/SillyTavernAI would disagree with you.

2 comments

otabdeveloper4 103 days ago

Roleplay isn't chat, it's gaming.

Yes, gaming is (of course) a big use case for LLMs.


g947o 105 days ago

Many people who use ST have a "serious" NVIDIA card.

We are talking about NPUs here.


BoredomIsFun 105 days ago

Are you kidding? A good share of ST folks run finetunes of Mistral Nemo (if that tells you anything). Anyway, your core statement ("local chat sucks") is simply wrong.


g947o 104 days ago

From their own GitHub:

> If you intend to do LLM inference on your local machine, we recommend a 3000-series NVIDIA graphics card with at least 6GB of VRAM, but actual requirements may vary depending on the model and backend you choose to use.

Also, please be respectful when discussing technical matters.

P.S. I didn't say "local chat sucks".


BoredomIsFun 104 days ago

> we recommend a 3000-series NVIDIA graphics card with at least 6GB of VRAM

...which is not by any means a powerful GPU. Besides, the AMD Ryzen AI CPUs in question have plenty of capacity to run local LLMs, especially MoE ones; on MoE models with 3B active parameters, mini PCs equipped with these CPUs dramatically outperform any "3000-series NVIDIA graphics card with at least 6GB of VRAM".

> please be respectful when discussing technical matters.

That applies more to your inappropriately righteous attitude than to mine.
