by Nevermark
109 days ago
• Having NPU cores since the M1 would seem to confirm that running models has been the game plan for a while. LLMs coming along can only have increased that focus.

• Studios with Ultra Mx chips, now 4-way RDMA over Thunderbolt 5, and enormous RAM and SSD options suggest a strong focus. I don't know what else that RAM would be intended for. Four Studio Ultras (a total of 360 GPU cores with M5 Ultras?) with 2TB of unified RAM is a local model beast.

• They refashioned their GPU cores to better support both graphics and neural processing, despite already having dedicated NPU cores.

I would say they have been leaning into local models for several years.

I expect we will see more models optimized for smaller sizes as demand for them increases. With hardware performance and neural focus trending up, and model requirements for a given quality trending down, the next few years will be interesting times.

What would make me happy: Ultra x 2 (i.e. 2x Ultra, 4x Max, 8x Pro, 16x M5) packaging in the Studio, with 8-way RDMA. Mac Kong.

Perhaps Apple will start making server cards again.