Hacker News new | ask | show | jobs
by formerly_proven 801 days ago
The price is low because they’re useless (except for replacing dead cards in a DGX), if you had a 40$ PCIe AIC-to-SXM adapter, the price would go up a lot.

> I'm one of those people who finds 'retro-super-computing' a cool hobby and thus the interfaces like OAM being open means that these devices may actually have a life for hobbyists in 8~10 years instead of being sent directly to the bins due to secret interfaces and obfuscated backplane specifications.

Very cool hobby. It’s also unfortunate how stringent e-waste rules lead to so much perfectly fine hardware to be scrapped. And how the remainder is typically pulled apart to the board / module level for spares. Makes it very unlikely to stumble over more or less complete-ish systems.

1 comments

I'm not sure the prices would go up that much. What would anyone buy that card for?

Yes, it has a decent memory bandwidth (~750 GB/s) and it runs CUDA. But it only has 16 GB and doesn't support tensor cores or low precision floats. It's in a weird place.

Scientific computing would buy it up like hot cakes.
Only if the specific workload needs FP64 (4.5 Tflop/s), the 9 Tflop/s for FP32 can be had for cheap with Turing or Ampere consumer cards.

Still, your point stands. It's crazy how that 2016 GPU has two thirds the FP32 power of this new 2024 unobtanium card and infinitely more FP64.

Somewhat off topic:

Is there a similar "magic value card" for low memory (2GB?) 8-bit LLMs?

Since memory is the expensive bit, surely there are low cost low memory models?

I believe that's what tenstorrent is aiming for.
The main offer of Tenstorrent goes into server racks and is designed to form clusters.

Standalone cards are more like dev kits.

(I’ve been tracking Tenstorrent for 3+ years and currently have Grayskull in ML test rig together with 3090)

IDK, is it really that much more powerful than the P40, which is already fairly cheap?
The P100 has amazing double precision (FP64) flops (due to a 1:2 FP ratio that got nixed on all other cards) and a higher memory bandwidth which made it a really standout GPU for scientific computing applications. Computational Fluid Dynamics, etc.

The P40 was aimed at the image and video cloud processing market I think, and thus the GDDR ram instead of HBM, so it got more VRAM but at much less bandwidth.

Well, the p40 has 24gb VRAM, which makes it the perfect hobbyist card for a llm, assuming you can keep it cool.
The pci-e p100 is has 16gb vram and won’t go below 160 dollars. Prices for these things would pick up if you could put them in some sort of pcie adapter