|
|
|
|
|
by sspiff
185 days ago
|
|
There is no way to achieve a high throughput low latency connection between 25 Strix Halo systems. After accounting for storage and network, there are barely any PCIe lanes left to link two of them together. You might be able to use USB4 but unsure how the latency is for that. |
|
Strix Halo IO topology: https://www.techpowerup.com/cpu-specs/ryzen-ai-max-395.c3994
Frameworks mainboard implements 2 of those PCIe4x4 GPP interfaces as M.2 PHY's which you can use a passive adapter to connect a standard PCIe AIC (like a NIC or DPU) to, and also interestingly exposes that 3rd x4 GPP as a standard x4 length PCIe CEM slot, though the system/case isn't compatible with actually installing a standard PCIe add in card in there without getting hacky with it, especially as it's not an open-ended slot.
You absolutely could slap 1x SSD in there for local storage, and then attach up to 4x RDMA supporting NIC's to a RoCE enabled switch (or Infiniband if you're feeling special) to build out a Strix Halo cluster (and you could do similar with Mac Studio's to be fair). You could get really extra by using a DPU/SmartNIC that allows you to boot from a NVMeoF SAN to leverage all 5 sets of PCIe4x4 for connectivity without any local storage but we're hitting a complexity/cost threshold with that that I doubt most people want to cross. Or if they are willing to cross that threshold, they'd also be looking at other solutions better suited to that that don't require as many workarounds.
Apple's solution is better for a small cluster, both in pure connectivity terms and also with respect to it's memory advantages, but Strix Halo is doable. However, in both cases, scaling up beyond 3 or especially 4 nodes you rapidly enter complexity and cost territory that is better served by nodes that are less restrictive unless you have some very niche reason to use either Mac's (especially non-pro) or Strix Halo specifically.