by bt1a
814 days ago
I think there are more than a few enthusiasts who would be very interested in buying one or more of these cards (if they had 32+ GB of memory), but I don't have any data to back that opinion up. It's not only those who can't afford a 4090, though. While the 4090 can run models that use less than 24GB of memory at blistering speeds, models are going to continue to scale up, and 24GB is fairly limiting. Because LLM inference can split a model's layers across multiple GPUs, high-memory GPUs that aren't super expensive are desirable. To share a personal perspective, I have a desktop with a 3090 and an M1 Max Studio with 64GB of memory. I use the M1 for local LLMs because I can use up to ~57GB of memory, even though the output (in tok/s) is much slower than what I get from models that fit on the 3090.

I would gladly buy a card that ran a touch slower but had massive VRAM, especially if it were affordable, but I guess that puts me in that camp of enthusiasts you mentioned.
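
For a rough sense of why 24GB feels limiting, here is a back-of-envelope sketch in Python. It counts only the memory for the weights (parameters times bits per weight) and then pads it with a flat overhead for KV cache and activations. The function names and the 20% overhead figure are my own assumptions for illustration, not measurements, and real usage depends on context length and the runtime.

```python
import math


def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    # params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


def min_cards(params_billion: float, bits_per_weight: float,
              card_gb: float, overhead_fraction: float = 0.2) -> int:
    """Smallest number of equal-sized cards needed if layers are split across them.

    overhead_fraction is an assumed flat allowance for KV cache and
    activations; it is a guess, not a measured value.
    """
    needed_gb = weight_memory_gb(params_billion, bits_per_weight) * (1 + overhead_fraction)
    return math.ceil(needed_gb / card_gb)


if __name__ == "__main__":
    for card_gb in (24, 32, 48):
        # Hypothetical 70B-parameter model quantized to 4 bits per weight
        print(f"{card_gb}GB cards needed: {min_cards(70, 4, card_gb)}")
```

Under these assumptions, a 70B model at 4 bits per weight needs about 35GB for the weights alone, so it does not fit on a single 24GB card no matter how fast that card is.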