Hacker News
by DrPhish 502 days ago
The Nano only has 4GB of VRAM, and DS-R1 has 671B parameters in FP8 (one byte per parameter, so roughly 671GB of weights).

You need something with about 800GB of memory to run the full model with context. You'd still need about 400GB even to run a half-sized Q4 quant of R1, so there is no reasonable way that it would work.
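
For anyone who wants to redo the arithmetic, here is a rough back-of-the-envelope sketch in Python. The function names are made up for illustration, and the 1.2x overhead factor for context and runtime buffers is my assumption, picked only because it lands near the ~800GB and ~400GB figures above; real overhead depends on context length and the inference stack.

```python
def weight_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_param / 8


def total_gb(params_billion: float, bits_per_param: float,
             overhead: float = 1.2) -> float:
    """Weights plus an ASSUMED multiplicative overhead for context/buffers."""
    return weight_gb(params_billion, bits_per_param) * overhead


def fits(memory_gb: float, params_billion: float, bits_per_param: float) -> bool:
    """True if the estimated total fits in the given amount of memory."""
    return total_gb(params_billion, bits_per_param) <= memory_gb


if __name__ == "__main__":
    # 671B parameters at FP8 (8 bits) and at a 4-bit quant
    for label, bits in (("FP8", 8), ("Q4", 4)):
        print(f"R1 {label}: weights ~{weight_gb(671, bits):.0f} GB, "
              f"with overhead ~{total_gb(671, bits):.0f} GB")
    print("Fits in 4 GB?", fits(4, 671, 4))
```

Treat the results as order-of-magnitude estimates only; practical 4-bit quant formats usually store a bit more than 4 bits per weight.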

2 comments

Just curious, what do you think is the cheapest card that would be needed to run this model, or something like Llama 3.3-70B?

Are only Nvidia cards compatible, or could AMD ones also work?

OK, I understand the flagship model is huge. It seems to be far from local use.

Has anyone done it with a smaller/distilled version and gotten good performance?