| HN Mirror


	by ge96 36 days ago
	Nice, I recently pulled down TheBloke's 7B Mistral to try out. I have a 4070.

3 comments

bashbjorn 36 days ago

I love Mistral, but that model is... not the best. Maybe try out Gemma 4 E4B: it's a similar size to Mistral 7B and should run great on your 4070 (the "E4B" naming is slightly misleading).

ge96 36 days ago

Thanks for the tip, what do you use Gemma 4 e4b for?

redanddead 36 days ago

Some say it's a miniaturized Gemini model.

It's good at writing and coding, and decently intelligent.

You can try it on NVIDIA NIM.

mixtureoftakes 36 days ago

Mistral 7B is quite outdated. On a 12 GB 4070 you can run Qwen 3.5 9B Q4_K_M or Qwen 3.6 35B; the latter will be a lot smarter but also a lot slower due to RAM offload.

Try both in lm studio, they really are surprisingly capable
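
For a rough sense of why the 9B quant fits in 12 GB while the 35B one spills into system RAM, here is a back-of-the-envelope sketch. The 4.8 bits-per-weight figure is an assumption (a ballpark average for a 4-bit K-quant), the estimate covers weights only and ignores KV cache and runtime overhead, and `weight_gb` is a hypothetical helper, not part of any tool mentioned in this thread.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of quantized weights in GB (1 GB = 1e9 bytes)."""
    # params_billion * 1e9 weights * bits / 8 bytes, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


VRAM_GB = 12.0         # 4070 VRAM, as stated in the comment above
BITS_PER_WEIGHT = 4.8  # ASSUMPTION: rough average for a 4-bit K-quant

for name, params_billion in [("9B", 9.0), ("35B", 35.0)]:
    size = weight_gb(params_billion, BITS_PER_WEIGHT)
    verdict = "fits in VRAM" if size < VRAM_GB else "needs RAM offload"
    print(f"{name}: ~{size:.1f} GB of weights, {verdict}")
```

Under these assumptions the 9B model leaves several GB of VRAM free for context, while the 35B model's weights alone exceed the card, which is consistent with the offload slowdown described above.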

ge96 36 days ago

I have 80 GB of RAM, but it's slow. Either it's capped by the i9 CPU or my specific ASUS mobo sucks; I think it only runs at 2400 MHz despite being DDR4.

I've tried all the usual stuff: BIOS settings, voltage tweaks.

macNchz 36 days ago

Gemma 4 26B-A4B might be interesting to try on your machine. The latest optimizations make MoE models work pretty nicely on setups like that with a decent GPU and lots of slowish RAM. I have a 16gb GPU and 64gb of 3200mhz DDR4 and get 15-20 tokens/sec out of that model with zero finagling or tweaking. I’ve been very impressed by it, even having run just about every other open weight model that would fit on my machine over the last few years.
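
As a rough illustration of why an MoE model stays usable with slowish RAM, the sketch below estimates a bandwidth-bound ceiling on decode speed. Everything model-specific here is an assumption: that "A4B" means roughly 4B active parameters per token, that the quant averages about 4.8 bits per weight, that the RAM runs in dual channel, and that all active weights are read from system RAM (the worst case, with nothing resident on the GPU). The function names are hypothetical.

```python
def ddr_bandwidth_gbs(mt_per_s: int, channels: int = 2, bus_bytes: int = 8) -> float:
    """Theoretical peak DDR bandwidth in GB/s (64-bit bus per channel)."""
    return mt_per_s * bus_bytes * channels / 1000


def decode_ceiling_tok_s(bandwidth_gbs: float,
                         active_params_billion: float,
                         bits_per_weight: float) -> float:
    """Upper bound on tokens/sec if every active weight is read once per token."""
    gb_per_token = active_params_billion * bits_per_weight / 8
    return bandwidth_gbs / gb_per_token


ACTIVE_PARAMS_B = 4.0  # ASSUMPTION: "A4B" read as ~4B active parameters
BITS_PER_WEIGHT = 4.8  # ASSUMPTION: ballpark for a 4-bit K-quant

for label, speed in [("DDR4-2400", 2400), ("DDR4-3200", 3200)]:
    bw = ddr_bandwidth_gbs(speed)
    ceiling = decode_ceiling_tok_s(bw, ACTIVE_PARAMS_B, BITS_PER_WEIGHT)
    print(f"{label}: ~{bw:.1f} GB/s peak, ceiling ~{ceiling:.0f} tok/s")
```

Real throughput can land on either side of this estimate: layers kept on the GPU avoid the RAM bottleneck, while real-world memory bandwidth falls short of the theoretical peak. It is only meant to show that the per-token cost scales with active parameters rather than total model size.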

ge96 36 days ago

That seems slow? 15-20, I was expecting 50-60 like Mistral, although I haven't measured that yet on my setup.

I've been asking other people but what do you use it for?

ganelonhb 36 days ago

I have a 2070 and can confirm it works amazingly fast.

I love TheBloke, I wish he still made stuff.

bashbjorn 36 days ago

Yeah, the TheBloke era of local LLMs was a good time. TBF, Unsloth are doing a fantastic job of publishing quants of the major models quickly - they just don't have nearly the volume of "weird" models that TheBloke did.

ge96 36 days ago

What do you use it for? I'm still trying to use agents, I barely use copilot, only at work when I have to.

I didn't want to get personal with an LLM unless it was local so that's why I was setting this up but yeah. So far just research is what I was looking at.

paradox460 36 days ago

A lot of the same spirit lives on in TheDrummer.

They're mostly aimed at role play and SillyTavern, but they're still generally good models, with lots of quants available.
