| Ya I winced after I wrote that and am considering taking a break from posting for a while. I'm just really tired of suffering the status quo because many people love it if it aligns with their worldview, while people like me who see things a little differently seem condemned to struggle in isolation. I've always admired Tesla, not Edison, but the Edison way of doing business seems to win out and suppress the people on the fringe. As for GPUs, they are useful for languages like GNU Octave (MATLAB) and Julia. Although there's a code smell with that. The matrix languages that would benefit most from SIMD seem to be the ones that aren't hardware accelerated on GPUs. Something has gone wrong: https://news.ycombinator.com/item?id=8302256 That was over 10 years ago and I doubt the situation is any better today. Misaligned incentives. I feel that the missing link is direct hardware access to the GPU ALUs. Without being able to transpile CPU code to GPU code and vice versa to provide bare metal multithreading and vector operations, we miss out on a whole branch of computer science. We can't do the interesting experiments that I always rant about, which would make stuff like genetic algorithms borderline trivial. We're stuck with cookie cutter solutions like OpenGL, Vulkan, Metal, etc. Yet we still praise Nvidia, even though like Microsoft/Intel in the 1990s, they're probably most responsible for stifling innovation in multicore computing. For what it's worth, you're right about the latency/throughput tradeoff. I don't mind stuff like Intel's integrated graphics or Apple's M line of processors with integrated GPU and NPU cores for raw throughput and efficiency. I just think they're silly because they operate at the wrong layer: 1) CPU 2) multicore 3) GPU. There's no layer 2. Why is that? |
> while people like me who see things a little differently seem condemned to struggle in isolation
Haha, you should consider founder life then. I think many founders feel like this.
The issue with transpiling to GPUs is that it is really hard to do that in a sensible, performant way. Something something a sufficiently smart compiler... There is plenty of @jit in Python, and there's TornadoVM for JVM.
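To make the @jit point concrete, here is a rough sketch of what that style looks like, assuming Numba as the JIT (my pick for illustration; other libraries have their own decorators). The names `saxpy` and `run_saxpy` are made up for this example. It compiles for the CPU only; Numba also has a separate `numba.cuda.jit` for GPU kernels, but that needs CUDA hardware, so it's left out here. If Numba isn't installed, the decorator falls back to a no-op and the loop runs as plain Python.

```python
# Sketch only: assumes Numba as the JIT; falls back to plain Python without it.
try:
    import numpy as np
    from numba import njit
    HAVE_NUMBA = True
except ImportError:
    HAVE_NUMBA = False

    def njit(func):
        # No-op stand-in so the example still runs without Numba.
        return func


@njit
def saxpy(a, xs, ys, out):
    # out[i] = a * xs[i] + ys[i], written as a scalar loop the JIT can compile.
    for i in range(len(xs)):
        out[i] = a * xs[i] + ys[i]


def run_saxpy(a, xs, ys):
    # Hypothetical helper: picks array types that suit the available backend.
    if HAVE_NUMBA:
        x = np.asarray(xs, dtype=np.float64)
        y = np.asarray(ys, dtype=np.float64)
        out = np.empty_like(x)
        saxpy(a, x, y, out)
        return out.tolist()
    out = [0.0] * len(xs)
    saxpy(a, xs, ys, out)
    return out


if __name__ == "__main__":
    print(run_saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))
```

The same loop body runs either way, which is the appeal. The hard part is everything this toy skips: memory transfers, divergent branches, and data layouts that a compiler can't easily guess.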
Nvidia's CUDA has been around since the 2000s and is somehow still the best way to program GPUs. Probably the best SIMT-oriented stack there is, even. From my perspective, it's everyone else stifling themselves.