HN Mirror


by damnitbuilds 70 days ago

I got Qwen 3.6 running locally on 12GB VRAM.

It went:

  AI: "I see you are building a Django project. How can I help?"

  Me: "When I click on the Reload button, it does not set the reload option correctly. Fix this"

     <10 minutes>

  AI: "I see you are building a Django project. How can I help?"

Needs more tweaking of the context window, I think.

Seriously, though, I agree that this is the future, once OpenAI et al. have gone bust.

3 comments

giwook 70 days ago

I think this is the key issue with running locally hosted models.

Yes, technically you can run them on 12 GB of VRAM.

But should you?

Realistically, 64 GB seems to be the current threshold for getting meaningful work done while also maintaining a large enough context window.
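For a rough sense of why the context window eats into a small VRAM budget, here is a back-of-the-envelope sketch. It only adds up quantized weights plus the KV cache (keys and values for every layer and every token) and ignores activations and runtime overhead. Every number in the example call is a made-up placeholder for illustration, not the spec of Qwen or any other real model.

```python
def estimate_memory_gb(n_params_billion, bits_per_weight, n_layers,
                       n_kv_heads, head_dim, context_len, kv_bytes=2):
    """Rough memory estimate: quantized weights plus KV cache, in GB (1e9 bytes)."""
    # Weights: parameter count times bits per weight, converted to bytes.
    weight_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    # KV cache: 2 tensors (K and V) per layer, each holding
    # n_kv_heads * head_dim values per token, at kv_bytes per value.
    kv_cache_bytes = 2 * n_layers * n_kv_heads * head_dim * context_len * kv_bytes
    return weight_bytes / 1e9, kv_cache_bytes / 1e9


# Hypothetical model, chosen only for illustration:
# 30B parameters at 4-bit, 48 layers, 8 KV heads of dimension 128,
# a 32,768-token context, and a 16-bit (2-byte) KV cache.
weights_gb, kv_gb = estimate_memory_gb(30, 4, 48, 8, 128, 32768)
print(f"weights ~{weights_gb:.1f} GB, KV cache ~{kv_gb:.1f} GB, "
      f"total ~{weights_gb + kv_gb:.1f} GB")
```

Under these assumed numbers the weights alone are already past 12 GB, and the KV cache grows linearly with context length, so a longer context means either more memory or offloading to slower RAM.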


baigy 70 days ago

This will drop further as intelligence density increases.


giwook 70 days ago

It should, which is why I said it is the current threshold.


damnitbuilds 69 days ago

I tweaked it and now I get good, better-than-Copilot answers on local hardware, just a little bit slower than Copilot (okay, ~10 minutes vs. ~1 minute).

I can take that for the joy of running this locally!


baigy 70 days ago

I think it's a huge bubble about to pop. I get that enterprises are like elephants, slow to move, locked into agreements.

But I think free is going to be infinitely better than paying Anthropic more money than you used to spend on your human payroll. The big pop is coming.
