by mirekrusin
470 days ago
When running on Apple silicon you want to use MLX, not llama.cpp as this benchmark does. Performance is much better than what's plotted there and seems to be getting better, right? Power consumption is almost 10x lower for Apple. VRAM is more than 10x larger. Price-wise, for running the same size models, Apple is cheaper. The upper limit (larger models, longer context) is far higher for Apple (with Nvidia you can easily put in 2 cards, but more than that becomes a whole complex setup no ordinary person can do). Am I missing something, or is Apple simply better for local LLMs right now?