by donsupreme 514 days ago
> I've tried their 7b model

Anything other than their 671b model is just a distilled model built on Qwen or Llama and trained on the 671b model's reasoning output, right?

1 comment

Correct. It's the best model I've been able to run locally, by a long shot.