Hacker News new | ask | show | jobs
by alecco 167 days ago
IMHO DGX Spark at $4,000 is a bad deal with only 273 GB/s bandwidth and the compute capacity between a 5070 and a 5070 TI. And with PCIe 5.0 at 64 GB/s it's not such a big difference.

And the 2x 200 GBit/s QSFP... why would you stack a bunch of these? Does anybody actually use them in day-to-day work/research?

I liked the idea until the final specs came out.

1 comments

I think the selling point is the 128GB of unified system memory. With that you can run some interesting models. The 5090 maxes out at 32GB. And they cost about $3000 and more at the moment.
1. /r/localllama unanimously doesn't like the Spark for running models

2. and for CUDA dev it's not worth the crazy price when you can dev on a cheap RTX and then rent a GH or GB server for a couple of days if you need to adjust compatibility and scaling.

I am not on reddit. What are they saying?
It isn't for "running models." Inference workloads like that are faster on a mac studio, if that's the goal. Apple has faster memory.

These devices are for AI R&D. If you need to build models or fine tune them locally they're great.

That said, I run GPT-OSS 120B on mine and it's 'fine'. I spend some time waiting on it, but the fact that I can run such a large model locally at a "reasonable" speed is still kind of impressive to me.

It's REALLY fast for diffusion as well. If you're into image/video generation it's kind of awesome. All that compute really shines when for workloads that aren't memory speed bound.

With a 5070 Ti performance that's a weird choice for R&D as well. You won't be able to train models that require anywhere near 100GB VRAM due to slow processing, and 5070 Ti is under $1k
Yeah, that's mostly fair, but it kind of misses the point. This is a professional tool for AI R&D. Not something that strives to be the cheapest possible option for the homelab. It's fine to use them in the lab, but that's not who they built it for.

If I wanted to I could go on ebay, buy a bunch of parts, build my own system, install my own OS, compile a bunch of junk, tinker with config files for days, and then fire up an extra generator to cope with the 2-4x higher power requirements. For all that work I might save a couple of grand and will be able to actually do less with it. Or... I could just buy a GB10 device and turn it on.

It comes preconfigured to run headless and use the NVIDIA ecosystem. Mine has literally never had a monitor attached to it. NVIDIA has guides and playbooks, preconfigured docker containers, and documentation to get me up and developing in minutes to hours instead of days or weeks. If it breaks I just factory reset it. On top of that it has the added benefit of 200Gbe QSFP networking that would cost $1,500 on it's own. If I decide I need more oomph and want a cluster I just buy another one and connect them, then copy/paste the instructions from NVIDIA.

what workflow/models are you using for media generation?
What’s GH and GB server?
Grace-Hopper and Grace-Blackwell. "Grace" is the integrated CPU+GPU architecture. DGX Spark is GB10 and it's allegedly like a small version of the server GB200.
GH200/GB200, Nvidia’s server hardware