by bigyabai
48 days ago
It might run the smaller flash version, but 96 GB is not enough for the trillion-parameter model. The M3 Ultra's GPU is also a bit on the weak side for large-scale inference, so you'll be waiting on token prefill in most coding/agent workflows.
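For a rough sense of why 96 GB falls short, here is a back-of-the-envelope sketch. It assumes "trillion-parameter" means exactly 1e12 weights and counts only the weights themselves; the bit widths are illustrative quantization levels I picked, not anything specific to this model, and `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in decimal GB."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS = 1e12            # assumption: "trillion-parameter" taken as exactly 1e12
UNIFIED_MEMORY_GB = 96   # the machine discussed above

# Illustrative bit widths (assumed), from full half-precision down to 2-bit.
for bits in (16, 8, 4, 2):
    needed = weight_memory_gb(PARAMS, bits)
    verdict = "fits" if needed <= UNIFIED_MEMORY_GB else "does not fit"
    print(f"{bits:>2}-bit: ~{needed:,.0f} GB of weights, {verdict} in {UNIFIED_MEMORY_GB} GB")
```

Even at an aggressive 2 bits per weight that is roughly 250 GB before any KV cache or activations, and in practice the GPU cannot use the full 96 GB anyway.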
Have you tried any other models with this M3 Ultra?