


	by blitzar 321 days ago
	> widely-available H100 GPUs

	Just looked in the parts drawer at home and don't seem to have a $25,000 GPU for some inexplicable reason.

8 comments

Kurtz79 321 days ago

Does it even make sense calling them 'GPUs' (I just checked NVIDIA product page for the H100 and it is indeed so)?

There should be a quicker way to differentiate between 'consumer-grade hardware that is mainly meant to be used for gaming and can also run LLM inference in a limited way' and 'business-grade hardware whose main purpose is AI training or running inference for LLMs'.

blitzar 321 days ago

We are fast approaching the return of the math coprocessor. In fashion they say that trends tend to reappear roughly every two decades; it's overdue.

egorfine 321 days ago

Yeah, I would love for Nvidia to introduce a faster update cycle for their hardware, so that we'll have models like "H201", "H220", etc.

I think it will also make sense to replace "H" with a brand number, sort of like they already do for consumer GPUs.

So then maybe one day we'll have a math coprocessor called "Nvidia 80287".

beAbU 321 days ago

I remember building high-end workstations for a summer job in the 2000s, where I had to fit Tesla cards in the machines. I don't remember what their device names were; we just called them Tesla cards.

"Accelerator card" makes a lot of sense to me.

WithinReason 321 days ago

It's called a Tensor Core and it's in most GPUs

genewitch 321 days ago

"GPGPU" was something from over a decade ago; for general purpose GPU computing

hnuser123456 320 days ago

Yeah, Crysis came out in 2007 and could run physics on the GPU.

AlphaSite 321 days ago

I think Apple calls them NPUs and Broadcom calls them XPUs. Given they’re basically the number 2 and 3 accelerator manufacturers, one of those probably works.

codedokode 321 days ago

By the way, I wonder which has more performance: a $25,000 professional GPU or a bunch of cheaper consumer GPUs costing $25,000 in total?

omneity 321 days ago

Consumer GPUs, in theory and by a large margin (10 5090s will eat an H100's lunch with 6 times the bandwidth, 3x the VRAM and a relatively similar compute ratio), but your bottleneck is the interconnect, and that is intentionally crippled to avoid Beowulf GPU clusters eating into their datacenter market.

Last consumer GPU with NVLink was the RTX 3090. Even the workstation-grade GPUs lost it.

https://forums.developer.nvidia.com/t/rtx-a6000-ada-no-more-...
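
A rough sketch of the arithmetic behind that comparison. The spec figures are assumptions taken from public spec sheets rather than from this thread (RTX 5090: 32 GB, about 1,792 GB/s; H100 SXM: 80 GB, about 3,350 GB/s), and the ratios shift depending on which H100 variant you compare against. It only sums per-card numbers, so it says nothing about the interconnect bottleneck described above.

```python
# Assumed per-card specs (not from the thread); adjust for other variants.
RTX_5090 = {"vram_gb": 32, "bandwidth_gb_s": 1792}
H100_SXM = {"vram_gb": 80, "bandwidth_gb_s": 3350}


def aggregate(card: dict, count: int) -> dict:
    """Naively sum per-card specs across `count` identical cards."""
    return {key: value * count for key, value in card.items()}


def ratios(cluster: dict, reference: dict) -> dict:
    """Ratio of each aggregate spec to the reference card's spec."""
    return {key: cluster[key] / reference[key] for key in reference}


cluster = aggregate(RTX_5090, 10)
result = ratios(cluster, H100_SXM)
print(f"VRAM: {result['vram_gb']:.1f}x, bandwidth: {result['bandwidth_gb_s']:.1f}x")
# prints "VRAM: 4.0x, bandwidth: 5.3x"
```

Under these assumed specs the paper advantage is in the same ballpark as the figures quoted above, but it only materializes if the workload can be split without heavy traffic between cards.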

sigbottle 321 days ago

H100s also have custom async WGMMA instructions, among other things. From what I understand, the async instructions at least formalize the notion of pipelining, which engineers were already using implicitly: to optimize memory accesses, you're effectively trying to overlap them in that kind of optimal parallel manner.

washadjeffmad 320 days ago

I just specify SXM (node) when I want to differentiate from PCIe. We have H100s in both.

addandsubtract 321 days ago

We could call the consumer ones GFX cards, and keep GPU for the matrix multiplying ones.

beAbU 321 days ago

GPU stands for "graphics processing unit" so I'm not sure how your suggestion solves it.

Maybe renaming the device to an MPU, where the M stands for "matrix/math/mips" would make it more semantically correct?

rebolek 321 days ago

I think that G was changed to "general", so now it's "general processing unit".

rpdillon 321 days ago

This doesn't seem to be true at all. It's a highly specialized chip for doing highly parallel operations. There's nothing general about it.

I looked around briefly and could find no evidence that it's been renamed. Do you have a source?

fouc 321 days ago

CPU is already the general (computing) processing unit so that wouldn't make sense

amelius 321 days ago

Well, does it come with graphics connectors?

OliverGuy 321 days ago

Nope, doesn't have any of the required hardware to even process graphics iirc

diggan 321 days ago

Although the RTX Pro 6000 is not consumer-grade, it does come with graphics ports (four DisplayPorts) and does render graphics like a consumer card :) So it seems the difference between the segments is becoming smaller, not bigger.

simpleintheory 321 days ago

That’s because it’s intended as a workstation GPU, not one used in servers

diggan 321 days ago

Sure, but it still sits in the 'business-grade hardware whose main purpose is AI training or running inference for LLMs' segment parent mentioned, yet it has graphics connectors. So the only thing I'm saying is that just looking at that won't help you understand what segment the GPU goes into.

dougSF70 321 days ago

With Ollama I got the 20B model running on 8 Titan X cards (2015). Ollama distributed the model so that the 15GB of VRAM required was split evenly across the 8 cards. The tok/s were faster than reading speed.

Aurornis 321 days ago

For the price of 8 decade-old Titan X cards, someone could pick up a single modern GPU with 16GB or more of RAM.

Aurornis 321 days ago

They’re widely available to rent.

Unless you’re running it 24/7 for multiple years, it’s not going to be cost effective to buy the GPU instead of renting a hosted one.

For personal use you wouldn’t get a recent generation data center card anyway. You’d get something like a Mac Studio or Strix Halo and deal with the slower speed.
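
As a back-of-the-envelope sketch of that rent-versus-buy trade-off, the break-even point can be estimated from two figures quoted in this thread: the $25,000 purchase price and a rental rate of about $2/h. Both are treated as assumptions here, and the calculation ignores power, hosting, resale value, and idle time, so it is a rough estimate, not a real cost model.

```python
HOURS_PER_YEAR = 24 * 365


def break_even_hours(purchase_price_usd: float, rental_rate_usd_per_hour: float) -> float:
    """Hours of rental whose total cost equals the purchase price."""
    if rental_rate_usd_per_hour <= 0:
        raise ValueError("rental rate must be positive")
    return purchase_price_usd / rental_rate_usd_per_hour


# Assumed figures from the thread: $25,000 to buy, about $2/h to rent.
hours = break_even_hours(25_000, 2.0)
years_at_full_utilization = hours / HOURS_PER_YEAR
print(f"{hours:.0f} rental hours, or about {years_at_full_utilization:.1f} years of 24/7 use")
# prints "12500 rental hours, or about 1.4 years of 24/7 use"
```

At lower utilization the break-even point stretches out proportionally: at 8 hours a day, the same 12,500 hours take a little over four years.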

varispeed 321 days ago

I rented H100s for training a couple of times and found that they couldn't do training at all. The same code worked fine on a Mac M1 or an RTX 5080, but on the H100 I was getting completely different results.

So I wonder what I could be doing wrong. In the end I just use the RTX 5080, as my models fit neatly in the available RAM.

* By "not working at all" I mean the scripts ran, but the results were wrong. As if the H100 couldn't do maths properly.

philipkiely 321 days ago

This comment made my day ty! Yeah definitely speaking from a datacenter perspective -- fastest piece of hardware I have in the parts drawer is probably my old iPhone 8.

vonneumannstan 321 days ago

>Just looked in the parts drawer at home and don't seem to have a $25,000 GPU for some inexplicable reason.

It just means you CAN buy one if you want, as in they're in stock and "available", not that you can necessarily afford one.

lopuhin 321 days ago

You can rent them for less than $2/h in a lot of places (maybe not in the drawer)

blueboo 321 days ago

You might find $2.50 in change to use one for an hour though

KolmogorovComp 321 days ago

available != cheap

blitzar 321 days ago

available /əˈveɪləbl/

adjective: available

able to be used or obtained; at someone's disposal

swexbe 321 days ago

You can rent one from most cloud providers for a few bucks an hour.

koakuma-chan 321 days ago

Might as well just use the OpenAI API

ekianjo 321 days ago

That's not the same thing at all

poly2it 321 days ago

That depends on your intentions.