by cjbprime
930 days ago
I think this is missing the main reason to use Apple Silicon: your dedicated graphics card probably has 24GB of VRAM or less, whereas e.g. an M2 Ultra Mac Studio can have 192GB of unified memory, with far higher memory bandwidth than ordinary x86 system RAM. This matters because even a "small" LLM like Llama2 13B needs about 26GB for its weights at FP16, so it has to be quantized to fit in the 24GB that the dedicated graphics card gives you, whereas the Mac could run Llama2 70B (about 140GB at FP16) without quantization.
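For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch. It assumes the nominal parameter counts (13 and 70 billion) and counts the weights only, so the KV cache, activations, and OS overhead are all ignored. The function name `weights_gb` and the 24GB/192GB budgets are just illustrative choices for this comparison, not anything from a real library.

```python
def weights_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes)."""
    # params_billion * 1e9 params * (bits / 8) bytes, divided by 1e9 bytes per GB
    return params_billion * bits_per_param / 8


# Memory budgets taken from the comparison above (assumed, weights only).
BUDGETS_GB = {"24GB graphics card": 24, "192GB M2 Ultra": 192}

for name, params in [("Llama2 13B", 13), ("Llama2 70B", 70)]:
    for bits in (16, 8, 4):
        size = weights_gb(params, bits)
        fits = [label for label, cap in BUDGETS_GB.items() if size <= cap]
        print(f"{name} @ {bits}-bit: {size:.1f} GB, fits in: {fits or 'neither'}")
```

In practice you need headroom beyond the weights, so treat these figures as lower bounds rather than exact requirements.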