| HN Mirror



	by amelius 1491 days ago
	> Accelerated GPU training is enabled using Apple’s Metal Performance Shaders (MPS) as a backend for PyTorch.

	What do shaders have to do with it? Deep learning is a mature field now; it shouldn't need to borrow compute architecture from the gaming/entertainment field. Anyone else find this disconcerting?

3 comments

dagmx 1491 days ago

Shaders are just the way compute is defined on the GPU.

Why is that concerning to you?


WhitneyLand 1491 days ago

It’s not the greatest term even for graphics only.

People new to CG are likely to intuit “shaders” as something related to, well, shading, but vertex shaders et al. have nothing to do with the color of a pixel or a polygon.


paulmd 1491 days ago

Wait until they learn a kernel has nothing to do with operating systems! And tensor operations have nothing to do with tensor objects! And texture memory often isn't even used for textures!

It's an unfortunate set of terminology due to the way this space evolved from graphics programming - shader cores used to do fixed-function shading! But then people wanted them to be able to run arbitrary shaders and not just fixed-function. And then hey, look at this neat processor, let's run a compute program on it. At first that was "compute shaders" running across graphics APIs, then came CUDA, and later OpenCL. But it is still running on the part of the hardware that provides shading to the graphics pipeline.
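To make the "kernel" / "compute shader" idea concrete, here is a minimal CPU-only sketch in Python. It is not real GPU code: `launch` is a hypothetical stand-in for a GPU dispatch and `saxpy_kernel` is an illustrative name, neither taken from any actual API. The point is only that a kernel is a small function executed once per index.

```python
def saxpy_kernel(i, a, x, y, out):
    """Body executed once per index i, like the work of one GPU thread."""
    out[i] = a * x[i] + y[i]


def launch(kernel, n, *args):
    """Hypothetical stand-in for a GPU dispatch: run the kernel for every index.

    A real GPU would run these invocations in parallel; this loop is sequential.
    """
    for i in range(n):
        kernel(i, *args)


x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
out = [0.0] * len(x)
launch(saxpy_kernel, len(x), 2.0, x, y, out)
print(out)  # [12.0, 24.0, 36.0]
```

Whether the per-index function is called a shader or a kernel, the shape of the program is the same.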

Similarly, texture memory actually used to be used for textures; now it is a general-purpose binding that coalesces any type of memory access that has 1D/2D/3D locality.
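As a rough illustration of what "2D locality" means in a linear address space, here is a sketch of Z-order (Morton) indexing in Python. This is an assumption chosen for illustration: the comment does not say which layout any GPU uses, and real hardware layouts are vendor-specific. `morton2d` and `row_major` are hypothetical helper names.

```python
def morton2d(x, y, bits=16):
    """Interleave the bits of x and y into a single Z-order index."""
    code = 0
    for b in range(bits):
        code |= ((x >> b) & 1) << (2 * b)       # x bits go to even positions
        code |= ((y >> b) & 1) << (2 * b + 1)   # y bits go to odd positions
    return code


def row_major(x, y, width):
    """Plain row-major index for comparison."""
    return y * width + x


width = 1024
# Vertical neighbours (0, 0) and (0, 1):
print(row_major(0, 1, width) - row_major(0, 0, width))  # 1024
print(morton2d(0, 1) - morton2d(0, 0))                  # 2
```

With row-major indexing, vertically adjacent elements sit a full row apart; with the interleaved index, the four elements of a 2x2 block occupy four consecutive indices.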

You kinda just get used to it. Lots of niches have their own lingo that takes some learning. Mathematics is incomprehensible without it, really.


my123 1491 days ago

That terminology isn't used at all in GPGPU compute APIs specifically tailored for that purpose, which use quite different programming models where you can mix host and device code in the same program.

And there are "GPUs" today that can't do graphics at all (AMD MI100/MI200 generations) or can do it only in a restricted way (Hopper GH100, which has the fixed-function pipeline on only two TPCs for compatibility, so graphics runs very slowly).


alfalfasprout 1491 days ago

There's absolutely a lot of "graphics" terminology that spills into GPGPU. For example, texture memory in CUDA :) The reality is that GPUs, even the ones that can't output video, are ultimately still using hardware that is largely rooted in gaming. Obviously the underlying architectures for these ML cards are moving away from that (increasingly using more die space for ML-related operations), but many of the core components like memory are still shared. It boils down to the fact that at the end of the day they're linear algebra processors.


my123 1491 days ago

I'd say that there has been quite a bit of sharing back and forth between the two. Evolutions in compute stacks shaped modern graphics APIs too.

Texture units are indeed a part that is useful enough to be exposed to GPGPU compute APIs directly. The "shader" term itself disappeared quite early in those though, as did access to a good part of the FF pipeline including the rasterisers themselves.


my123 1491 days ago

Apple doesn't have a separate API tailored towards compute only, but a single unified API that makes concessions to both.

Concessions towards compute: a C++ programming language for device code (totally unlike what's done for most graphics APIs!)

Concessions towards graphics: no single-source programming model at all for example...


sudosysgen 1491 days ago

Many GPUs allow you to write device code in C++ via SYCL. It works well enough.


geertj 1491 days ago

Not sure if it’s concerning but it caught my eye as well.
