by treprinum
559 days ago
The 4090 is 5x faster than the M3 Max 128GB according to my tests, but it can't even run inference on LLaMA-30B. The moment you hit its memory limit, inference is suddenly 30x slower than on the M3 Max. So a basic GPU with 128GB of RAM would trash the 4090 on those larger LLMs.
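
A rough back-of-the-envelope check of the memory limit, as a sketch only. It assumes fp16 weights (2 bytes per parameter), a nominal 30 billion parameters, 24 GB of VRAM on the 4090, and that all 128 GB on the M3 Max is usable. It ignores the KV cache, activations, and runtime overhead, and quantized weights would change the numbers. The helper names are made up for illustration.

```python
def weights_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


def fits(n_params: float, bytes_per_param: float, memory_gb: float) -> bool:
    """True if the weights alone fit in the given amount of memory."""
    return weights_gb(n_params, bytes_per_param) <= memory_gb


N_PARAMS = 30e9   # assumption: nominal "30B" parameter count
FP16_BYTES = 2    # assumption: unquantized half-precision weights

# Assumed memory sizes: 24 GB VRAM (4090), 128 GB unified memory (M3 Max).
devices = [("RTX 4090", 24), ("M3 Max 128GB", 128)]

for name, memory_gb in devices:
    need = weights_gb(N_PARAMS, FP16_BYTES)
    verdict = "fits" if fits(N_PARAMS, FP16_BYTES, memory_gb) else "does not fit"
    print(f"{name}: need ~{need:.0f} GB of {memory_gb} GB, {verdict}")
```

Once the weights don't fit, they have to spill out of VRAM, which is where the sudden slowdown comes from.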