by CGamesPlay 670 days ago
> You can run Llama 3.1 70B — the big Llama — for LLM jobs.

That's the medium Llama. Does anyone know if an L40S would run the 405B version?

Hi, I'm the person who wrote that sizing comment in the draft for this article. I have been trying for a while and have been unsuccessful at getting 405B running on any of the GPU machines. I suspect I'd need a raw 8xA100 node to do it at Q4. I doubt there is any reasonable combination of L40S cards that can do it on fly.io. It's just too big. I suspect that in time the 70B model will be brought up to be roughly equivalent, but realistically it's already at the GPT-4 threshold as is. I've found that 70B is more than sufficient in practice.
Be that as it may, Llama 3.1 70B is not the big Llama.
I fixed it.
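
For anyone wanting to sanity-check the sizing argument above, here is a rough back-of-the-envelope sketch. It only estimates weight memory and is not a statement of what will actually run. Assumptions, none of which come from the thread: Q4 is treated as exactly 4 bits per weight (real Q4 formats carry some extra per-block metadata), an L40S is taken to have 48 GB of memory, and the 1.2 overhead factor for KV cache and activations is a made-up placeholder. The function names are hypothetical.

```python
import math


def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB (10^9 bytes), weights only."""
    return params_billion * bits_per_weight / 8


def cards_needed(params_billion: float, bits_per_weight: float,
                 card_gb: float, overhead: float = 1.2) -> int:
    """Lower bound on card count.

    `overhead` is an assumed fudge factor for KV cache and activations,
    not a measured value.
    """
    total_gb = weight_gb(params_billion, bits_per_weight) * overhead
    return math.ceil(total_gb / card_gb)


if __name__ == "__main__":
    L40S_GB = 48  # assumed per-card memory
    for name, size in [("70B", 70), ("405B", 405)]:
        gb = weight_gb(size, 4)
        n = cards_needed(size, 4, L40S_GB)
        print(f"{name} at 4 bits: ~{gb:.1f} GB of weights, at least {n} x L40S")
```

Under these assumptions the 405B weights alone come to roughly 200 GB at 4 bits, versus about 35 GB for 70B. The card count is only a floor: it ignores how many GPUs a single machine can actually expose, interconnect bandwidth between cards, and context length, which is where the "too big" judgment above comes from.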