They're in a different ballpark in memory bandwidth. The right comparison is the Ryzen AI Max 395 with 128GB DDR5-8000, which can be bought for around $1800 / 1750€.
$4,000 is actually extremely competitive. Even for an at-home enthusiast setup this price is not out of reach. I was expecting something far higher. That said, Nvidia's MSRP has been something of a pipe dream recently, so we'll see what the price and availability look like when it's actually released. Curious also to see how they may scale together.
For this form factor it will likely be ~2 years until the next one, based on the Vera CPU and whatever GPU. The 50W CPU will probably improve power efficiency.
If SOCAMM2 is used, it will probably still top out somewhere in the range of 512-768 GB/s of bandwidth, unless LPDDR6X / LPDDR7X or SOCAMM2 is that much better. SOCAMM on the DGX Station is just 384 GB/s with LPDDR5X.
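For anyone who wants to sanity-check bandwidth figures like these, theoretical peak bandwidth is just transfer rate times bus width. A rough Python sketch follows; the function name is made up for illustration, and the bus widths in the examples are assumptions, not numbers from this thread.

```python
def peak_bandwidth_gb_s(transfer_rate_mt_s: float, bus_width_bits: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (decimal gigabytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mt_s * 1e6 * bytes_per_transfer / 1e9


# Illustrative configurations only; the bus widths are assumptions.
examples = {
    "8000 MT/s on a 256-bit bus": (8000, 256),
    "8533 MT/s on a 256-bit bus": (8533, 256),
    "8533 MT/s on a 512-bit bus": (8533, 512),
}

for label, (rate, width) in examples.items():
    print(f"{label}: {peak_bandwidth_gb_s(rate, width):.0f} GB/s")
```

Real-world throughput lands below these peaks, so treat them as ceilings rather than predictions.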
Form factor will be neutered for the near future, but will probably retain the highest compute for the form factor.
The only way that changes is if Intel or AMD step on the gas. That leaves this maybe 2-3 years of running room, plus another 2 years on top; unless they already have something cooking, it isn't going to happen.
Software-driven changes could occur too! Maybe the next model will beat the pants off of this with far inferior hardware. Or maybe it'll be so amazing with higher-bandwidth hardware that anyone running at less than 500 GB/s will be left feeling foolish.
Maybe a company is working on something totally different in secret that we can't even imagine. The amount of £ thrown into this space at the moment is enormous.
Based on what data? I'm not denying the possibility but this seems like baseless FUD. We haven't even seen what folks have done with this hardware yet.
MSRP, sure, but try getting your hands on one without a bulk order and/or camping out in a tent all weekend. I have seen people in my area buying pre-built machines, as they often cost less than trying to buy an individual card.
It’s not that hard to come across MSRP 5090s these days. It took me about a week before I found one. But if you don’t want to put any effort or waiting into it, you can buy one of the overpriced OC models right now for $2500.
Still, a PC with a 5090 will in many cases give much better bang for the buck, except when limited by the slower speed of the main memory.
The greater bandwidth available when accessing the entire 128 GB memory is the only advantage of NVIDIA DGX, while a cheaper PC with discrete GPU has a faster GPU, a faster CPU and a faster local GPU memory.
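A back-of-the-envelope way to see why bandwidth matters so much here: during decoding, every generated token has to read the active weights once, so memory bandwidth divided by bytes read per token gives an upper bound on tokens per second. The Python sketch below assumes that simple model; the function name and the example numbers (273 GB/s, 1792 GB/s, 5B active parameters at 4 bits) are illustrative assumptions, and it ignores KV-cache reads, compute limits, and whether the model fits in that memory at all.

```python
def max_decode_tokens_per_s(bandwidth_gb_s: float,
                            active_params_billion: float,
                            bits_per_weight: float) -> float:
    """Upper bound on decode speed for a purely bandwidth-bound model."""
    # Bytes that must be streamed from memory for each generated token.
    bytes_per_token = active_params_billion * 1e9 * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Hypothetical model: 5B active parameters stored at 4 bits per weight.
for label, bandwidth in [("unified memory at 273 GB/s", 273.0),
                         ("discrete GPU memory at 1792 GB/s", 1792.0)]:
    bound = max_decode_tokens_per_s(bandwidth, 5.0, 4.0)
    print(f"{label}: at most {bound:.0f} tokens/s")
```

The bound only applies while the weights stay in that memory pool; once layers spill to slower system RAM, the slowest pool dominates.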
I’ve been thinking the same… I have a Jetson Thor, and the only difference I can imagine is the capability to connect two DGX Sparks together… but then I’d rather go for an RTX Pro 6000 instead of buying two DGX Spark units, because I prefer the higher memory bandwidth and more CUDA cores, tensor cores and RT cores over 256 GB of memory for my use case.
The Jetson Thor seems to be quite different. The Thor whitepaper lists 8 TFlop/s of FP32 compute, where the DGX Spark seems to be closer to 30 TFlop/s. Also 48 SMs on the Spark vs 20 on the Jetson.
Well, that’s disappointing since the Mac Studio 128GB is $3,499. If Apple happens to launch a Mac Mini with 128GB RAM it would eat the Nvidia Spark’s lunch every day.
Only if it runs CUDA; MLX / Metal isn't comparable as an ecosystem.
People who keep pushing for Apple gear tend to forget that Apple has decided that what the industry considers standards, proprietary or not, won't be made available on its hardware.
Even if Metal is actually a cool API to program for.
It depends what you're doing. I can get valuable work done with the subset of Torch supported on MPS and I'm grateful for the speed and RAM of modern Mac systems. JAX support is worse but hopefully both continue to develop.
FYI you should have used llama.cpp to do the benchmarks. It performs almost 20x faster than ollama for the gpt-oss-120b model. Here are some sample results on my Spark:
Is this the full weight model or quantized version? The GGUFs distributed on Hugging Face labeled as MXFP4 quantization have layers that are quantized to int8 (q8_0) instead of bf16 as suggested by OpenAI.
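If you want to check this yourself without loading the model, the tensor types can be read straight from the GGUF header. Below is a minimal Python sketch, assuming the little-endian GGUF v2/v3 layout as I understand it from the spec; the helper names are hypothetical, and the type-ID table is a partial subset assumed from ggml.h that should be verified against your checkout.

```python
import io
import struct

# Partial map of ggml tensor type IDs (assumed from ggml.h, verify locally).
GGML_TYPE_NAMES = {0: "f32", 1: "f16", 8: "q8_0", 30: "bf16", 39: "mxfp4"}

# struct formats for the fixed-size GGUF metadata value types.
_SCALAR_FMT = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
               6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9


def _read(f, fmt):
    """Read and unpack a single little-endian value."""
    return struct.unpack(fmt, f.read(struct.calcsize(fmt)))[0]


def _read_string(f):
    """GGUF strings are a uint64 length followed by UTF-8 bytes."""
    length = _read(f, "<Q")
    return f.read(length).decode("utf-8")


def _skip_value(f, value_type):
    """Advance past one metadata value without keeping it."""
    if value_type == _STRING:
        _read_string(f)
    elif value_type == _ARRAY:
        elem_type = _read(f, "<I")
        count = _read(f, "<Q")
        for _ in range(count):
            _skip_value(f, elem_type)
    else:
        f.read(struct.calcsize(_SCALAR_FMT[value_type]))


def tensor_types(f):
    """Return {tensor name: type name} from a binary GGUF file object."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    _read(f, "<I")                  # format version (not used here)
    tensor_count = _read(f, "<Q")
    kv_count = _read(f, "<Q")
    for _ in range(kv_count):
        _read_string(f)             # metadata key
        _skip_value(f, _read(f, "<I"))
    types = {}
    for _ in range(tensor_count):
        name = _read_string(f)
        n_dims = _read(f, "<I")
        f.read(8 * n_dims)          # dimensions, uint64 each
        type_id = _read(f, "<I")
        f.read(8)                   # data offset, uint64
        types[name] = GGML_TYPE_NAMES.get(type_id, f"type_{type_id}")
    return types
```

Calling `tensor_types(open(path, "rb"))` on a downloaded file and looking up a tensor name gives the stored type for that layer; IDs missing from the table come back as `type_<n>`.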
For example, looking at blk.0.attn_k.weight, it's q8_0, as are other layers: