by vunderba 506 days ago
As other HN users have pointed out, these are models distilled from DeepSeek-R1 and built on Llama and Qwen bases, not R1 itself. There's almost zero chance OP has a computer capable of running the full 671B R1 model locally.

Realistically, they might be able to run a 4-bit quant of the 70B distill, though.
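
For a rough sense of the gap, here is a back-of-the-envelope sketch in Python. It assumes memory is dominated by the weights (parameters times bits per weight) and ignores KV cache, activations, and runtime overhead, so real requirements will be somewhat higher. The function name `weight_memory_gb` is just something I made up for illustration.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: ignores KV cache, activations, and framework overhead.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # Parameter counts taken from the comment above: full R1 vs. the 70B distill.
    for name, params in [("R1 671B", 671), ("distill 70B", 70)]:
        for bits in (16, 8, 4):
            gb = weight_memory_gb(params, bits)
            print(f"{name} @ {bits}-bit: ~{gb:.1f} GB")
```

By this estimate the 70B model at 4 bits needs about 35 GB for weights, while the full 671B model needs about 335 GB even at 4 bits.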