Hacker News
by pxc 315 days ago
Is the DeepSeek model you're running a distill, or is it the full 671B-parameter model?