by moffkalast 989 days ago
Hardly anyone can even run a 70B model, let alone 180B. Any anecdata will be extremely rare.

In theory, eight 80 GB A100s give you 640 GB of memory, which is enough to launch it. Falcon 180B in fp16 needs about 360 GB for the weights alone (180B parameters at 2 bytes each), so it would fit. It would definitely be very expensive, though.
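For anyone who wants to check that arithmetic, here is a rough sketch. It counts the weights only (no KV cache, activations, or framework overhead) and uses decimal GB; the function name is mine, not from any library.

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory for the model weights alone, in decimal GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

# Falcon 180B in fp16: 2 bytes per parameter.
falcon_fp16_gb = weight_memory_gb(180e9, 2)

# Eight A100s with 80 GB each.
cluster_gb = 8 * 80

# Whatever is left over has to cover KV cache, activations, and overhead.
headroom_gb = cluster_gb - falcon_fp16_gb

print(f"weights: {falcon_fp16_gb:.0f} GB, cluster: {cluster_gb} GB, headroom: {headroom_gb:.0f} GB")
```

So the weights fit with room to spare, but how far the remaining memory goes depends on batch size and context length, which this sketch ignores.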
llama.cpp can run quantized Falcon on a top-end Mac Studio, which is only five grand: https://twitter.com/ggerganov/status/1699791226780975439
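A back-of-the-envelope sketch of why quantization makes this possible. The 192 GB figure is my assumption for the top-end Mac Studio's unified memory, and the bit widths are nominal: real quantized formats store extra scale data per block, and the OS will not hand all unified memory to the GPU, so treat the numbers as optimistic bounds.

```python
def quantized_weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Weights-only size in decimal GB at a nominal bit width."""
    return n_params * bits_per_weight / 8 / 1e9

def max_bits_per_weight(memory_gb: float, n_params: float) -> float:
    """Largest nominal bit width whose weights alone fit in memory_gb."""
    return memory_gb * 1e9 * 8 / n_params

N_PARAMS = 180e9        # Falcon 180B
MAC_STUDIO_GB = 192     # assumed unified memory on the top-end configuration

for bits in (16, 8, 4):
    size = quantized_weight_gb(N_PARAMS, bits)
    fits = "fits" if size <= MAC_STUDIO_GB else "does not fit"
    print(f"{bits:>2}-bit: {size:.0f} GB ({fits} in {MAC_STUDIO_GB} GB, weights only)")

bits_budget = max_bits_per_weight(MAC_STUDIO_GB, N_PARAMS)
print(f"upper bound on bits per weight: {bits_budget:.2f}")
```

At a nominal 4 bits the weights are a quarter of the fp16 size, which is what brings a model this large within reach of a single machine.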

If I'm paying a third party a hundred bucks a month, I'd at least want them to be able to match the capacities of consumer hardware.