| HN Mirror

by Nevermark 109 days ago

• Having had NPU cores since the M1 would seem to confirm that running models has been part of the game plan for a while. LLMs coming along can only have increased that focus.

• Studios with Mx Ultra chips, now with 4-way RDMA over Thunderbolt 5 and enormous RAM and SSD options, suggest a strong focus. I don't know what else that RAM would be intended for. Four Studio Ultras (a total of 360 GPU cores with M5 Ultras?) with 2TB of combined unified RAM make a local model beast.

• They refashioned their GPU cores to better support both graphics and neural processing, despite already having dedicated NPU cores.
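As a rough back-of-envelope check on the four-Studio numbers, here is a minimal Python sketch. The per-node figures (512 GB and 90 GPU cores) are simply the totals in the comment divided by four, and the M5 Ultra core count is a guess, not a published spec. The `usable_fraction` parameter and both function names are hypothetical, chosen only for illustration, and GB is treated as 10^9 bytes.

```python
# Back-of-envelope totals for a hypothetical cluster of four Mac Studios.
# All per-node numbers are assumptions derived from the totals in the comment.
NODES = 4
RAM_PER_NODE_GB = 512      # assumption: 2 TB total / 4 nodes
GPU_CORES_PER_NODE = 90    # assumption: 360 total / 4 nodes (speculative M5 Ultra)


def cluster_totals(nodes: int, ram_gb: int, gpu_cores: int) -> dict:
    """Sum RAM and GPU cores across identical nodes."""
    return {"ram_gb": nodes * ram_gb, "gpu_cores": nodes * gpu_cores}


def max_params_billions(ram_gb: float, bits_per_weight: float,
                        usable_fraction: float = 0.75) -> float:
    """Rough upper bound on model size (billions of parameters) that fits
    in RAM, counting weights only. usable_fraction is a hypothetical
    allowance for the OS, KV cache, and activations."""
    bytes_available = ram_gb * 1e9 * usable_fraction
    bytes_per_weight = bits_per_weight / 8
    return bytes_available / bytes_per_weight / 1e9


if __name__ == "__main__":
    totals = cluster_totals(NODES, RAM_PER_NODE_GB, GPU_CORES_PER_NODE)
    print(totals)
    for bits in (16, 8, 4):
        size = max_params_billions(totals["ram_gb"], bits)
        print(f"{bits}-bit weights: roughly {size:.0f}B parameters")
```

This counts weights only and ignores interconnect bandwidth, so it says what could fit in memory, not how fast it would run.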

I would say they have been leaning into local models for several years.

I expect we will see more models being optimized for smaller sizes as demand for them increases. With hardware performance and neural focus trending up, and the resources a model needs for a given level of quality trending down, the next few years will be interesting times.

What would make me happy: a package equivalent to two Ultras (i.e. 2xUltra, or 4xMax, 8xPro, 16xM5) in the Studio, with 8-way RDMA. Mac Kong. Perhaps Apple will start making server cards again.