Hacker News
by logicchains 1128 days ago
You can get around 4-5 tokens per second on the 65B LLaMA with a 32-core Ryzen CPU and 256GB of RAM. I'm not sure how much it costs to build, but you can rent one from Hetzner for around two hundred bucks a month.
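For a rough sense of what that works out to per token, here's a back-of-the-envelope sketch. It assumes a 30-day month and the box generating 24/7 at the quoted speed, which is optimistic. The function name and those assumptions are mine; only the 4-5 tokens per second and the roughly 200 a month come from the numbers above.

```python
# Assumption: a 30-day month with the machine generating tokens non-stop.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60


def cost_per_million_tokens(tokens_per_second, monthly_cost):
    """Rental cost per one million generated tokens at full utilization."""
    tokens_per_month = tokens_per_second * SECONDS_PER_MONTH
    return monthly_cost / (tokens_per_month / 1_000_000)


# 4 and 5 tokens/s are the ends of the quoted range; 200 is the rough monthly rent.
for tps in (4, 5):
    cost = cost_per_million_tokens(tps, 200)
    print(f"{tps} tok/s: ~{cost:.2f} per million tokens")
```

Under those assumptions it lands somewhere around 15 to 19 bucks per million tokens. Any idle time pushes the real number up.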