|
|
|
|
|
by noelwelsh
367 days ago
|
|
I think the main limitation, right now, is hardware. For GPUs the main limit is the VRAM available on consumer models. CPUs have plenty of memory but don't have the bandwidth or vector compute power for LLMs. This is why I think the Strix Halo is so exciting: it has bandwidth + compute power plus a lot of memory. It's not quite where it needs to be to replace a dedicated GPU, but in a few iterations it could be. I'm interested in other opinions. I'm no expert on this stuff. |
|