by pocketarc 1045 days ago
You are right about that being the cheapest, of course, in the sense that 64GB of HDD space is always going to be cheaper than RAM. But when you say

> thousands for RAM

I wonder if your perspective might be a little off - you can get 64GB DDR4 RAM for ~$100, it’s really not a big deal these days.

It’s a big deal on Mac, of course, where 64GB means a big, kitted-out high-end model that costs thousands, but RAM really is that cheap.


Understandable; the reason I said "thousands for RAM" was that when I wrote that sentence, I was adding the theoretical RAM and GPU prices together. Oh well.
My apologies, I think the bit of context missing from my response is that you don't need a GPU at all. 64GB of RAM will suffice to run a 70B model on your CPU, and it won't even be -that- slow; you'll get a few tokens per second.

So while a lot of us think that you need to splurge in order to get into LLMs, the reality is you don't, not really, and pretty much any computer will run any model, thanks to the efforts of projects like llama.cpp. Even using the disk like you mentioned! That's a thing, too. It's slower, but it's entirely possible.

If you're willing to drop down to the 7B/13B models, you'll need even less RAM (you can run 7B models with less than 8GB of RAM), and they'll run radically faster.
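
To put rough numbers on the RAM figures above, here's a back-of-the-envelope sketch. It assumes the weights are quantized to about 4.5 bits each (my assumption, roughly what a 4-bit quantization with per-block scales works out to), and it only counts the weights; context/KV cache and the OS need some room on top. The function name is made up for illustration.

```python
def weights_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: ~4.5 bits per weight, i.e. a 4-bit quantization plus per-block scales.
BITS_PER_WEIGHT = 4.5

for name, n_params in [("7B", 7e9), ("13B", 13e9), ("70B", 70e9)]:
    size = weights_size_gb(n_params, BITS_PER_WEIGHT)
    print(f"{name}: ~{size:.1f} GB of weights")
```

Under that assumption, a 70B model's weights come in well under 64GB and a 7B model's fit in under 8GB, which lines up with the figures above. At 16 bits per weight, the 70B would need about 140GB, so the quantization is what makes this work.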

People have been working really hard to make it possible to run all these models on all sorts of different hardware, and I wouldn't be surprised if Llama 3 comes out in much bigger sizes than even the 70B, since hardware isn't as much of a limitation anymore.