Hacker News
by zozbot234 556 days ago
It's not "really slow" at all; 1 tok/sec is absolutely par for the course given the overall model size. The 405B model was never actually intended for production use, so the fact that it can even kinda run at almost-usable speeds is itself noteworthy.
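
For a rough sense of why ~1 tok/sec is in the expected range, here is a back-of-envelope sketch. It assumes (my assumptions, not anything measured) that decoding is memory-bandwidth bound and that every weight is read once per generated token, so tokens/sec is roughly bandwidth divided by model size in bytes. The function name and the bandwidth figures are hypothetical placeholders, not numbers for any particular machine.

```python
def est_tokens_per_sec(n_params, bytes_per_param, mem_bandwidth_gb_s):
    """Upper-bound estimate for single-stream decoding speed.

    Assumes each token requires streaming all weights from memory once,
    and ignores compute, KV cache traffic, and any other overhead.
    """
    model_bytes = n_params * bytes_per_param
    return (mem_bandwidth_gb_s * 1e9) / model_bytes


N_PARAMS = 405e9  # parameter count of the 405B model

# Hypothetical precisions and memory bandwidths, for illustration only.
for label, bytes_per_param in [("fp16", 2.0), ("int8", 1.0), ("4-bit", 0.5)]:
    for bandwidth in (100, 400, 800):  # GB/s
        rate = est_tokens_per_sec(N_PARAMS, bytes_per_param, bandwidth)
        print(f"{label:>5} @ {bandwidth:>3} GB/s: ~{rate:.2f} tok/s")
```

Under these assumptions, even a 4-bit copy of the weights is about 200 GB, so you need roughly 200 GB/s of effective memory bandwidth just to reach 1 tok/sec. Real numbers will come in lower than this bound.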