


	by mechagodzilla 423 days ago
	I use a dual-socket Xeon workstation with 18 cores per socket (36 total) and 768GB of DDR4, and get about 1.5-2 tokens/sec with a 4-bit quantized version of the full DeepSeek models. It really is wild to be able to run a model like that at home.


stirfish 423 days ago

Dumb question: would something like this have a graphics card too? I assume not.


mechagodzilla 422 days ago

Yeah, it was just a giant HP workstation - I currently have 3 graphics cards in it (but only 40GB total of VRAM, so not very useful for DeepSeek models).
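
For anyone wondering why 40GB of VRAM doesn't go far here, a rough back-of-the-envelope sketch in Python may help. It assumes a total of 671 billion parameters (the published figure for DeepSeek-V3/R1, not something stated in this thread) and exactly 4 bits per weight. Real 4-bit formats also store scaling metadata, and inference needs room for the KV cache and activations, so treat the result as a lower bound rather than a measurement.

```python
# Rough weight-storage estimate for a quantized model.
# ASSUMPTION: 671e9 parameters (published DeepSeek-V3/R1 total; not from the thread).
# ASSUMPTION: exactly 4 bits per weight, ignoring quantization scales,
# KV cache, and activation memory, so this is only a lower bound.


def weight_footprint_gb(num_params: float, bits_per_weight: float) -> float:
    """Return approximate weight storage in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS = 671e9        # assumed parameter count
SYSTEM_RAM_GB = 768   # DDR4 capacity mentioned in the thread
TOTAL_VRAM_GB = 40    # combined VRAM of the 3 cards mentioned in the thread

weights_gb = weight_footprint_gb(PARAMS, 4)
fits_in_ram = weights_gb <= SYSTEM_RAM_GB
fits_in_vram = weights_gb <= TOTAL_VRAM_GB

print(f"weights: ~{weights_gb:.1f} GB")
print(f"fits in system RAM: {fits_in_ram}")
print(f"fits in VRAM: {fits_in_vram}")
```

Under those assumptions the weights alone come to roughly 335GB, which fits comfortably in 768GB of system RAM but is several times larger than the 40GB of VRAM.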
