by bitL 3141 days ago
Intel + AMD should get together to offer a CUDA alternative/compatibility layer for Deep (Reinforcement) Learning/AI, where Nvidia has seen exponential growth over the past few years. They are simply nonexistent there, with nothing but half-baked efforts.
7 comments

AMD doesn't care about Deep Learning.

This is a quote:

"Are we afraid of our competitors? No, we're completely unafraid of our competitors," said Taylor. "For the most part, because—in the case of Nvidia—they don't appear to care that much about VR. And in the case of the dollars spent on R&D, they seem to be very happy doing stuff in the car industry, and long may that continue—good luck to them. We're spending our dollars in the areas we're focused on."

"Car stuff" being self-driving cars, while "the areas we're focused on" is VR. From http://arstechnica.co.uk/gadgets/2016/04/amd-focusing-on-vr-...

AMD has made numerous press releases about supporting deep learning, sometimes via OpenCL, sometimes via cross compiling CUDA or sometimes something else.

I used to get excited about it.

Now, I have a rule: don't get excited about AMD (or Intel, or any new hardware) until they are winning at training neural networks on an absolute speed basis. (Note: vendors will frequently release benchmarks showing how they beat Nvidia. Almost inevitably these are for inference, and often on a speed per watt or speed per dollar, or you can't actually buy the hardware.)
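To make the metric point concrete, here is a minimal Python sketch. The two accelerators and every number attached to them are made up purely for illustration (they are not measurements of any real hardware). The only point is that the same pair of devices can rank differently depending on whether you look at absolute throughput, throughput per watt, or throughput per dollar.

```python
# Hypothetical devices: every number below is invented for illustration only.
devices = {
    "card_a": {"images_per_sec": 1000.0, "watts": 300.0, "dollars": 8000.0},
    "card_b": {"images_per_sec": 600.0, "watts": 150.0, "dollars": 3000.0},
}


def best_by(metric):
    """Return the name of the device that maximizes the given metric function."""
    return max(devices, key=lambda name: metric(devices[name]))


# Absolute training speed: the number the rule above cares about.
absolute = best_by(lambda d: d["images_per_sec"])
# Efficiency-style metrics: the ones vendor benchmarks often lead with.
per_watt = best_by(lambda d: d["images_per_sec"] / d["watts"])
per_dollar = best_by(lambda d: d["images_per_sec"] / d["dollars"])

print("absolute:", absolute)
print("per watt:", per_watt)
print("per dollar:", per_dollar)
```

With these invented numbers the slower card wins both efficiency metrics while still losing on absolute speed, which is why a headline "we beat Nvidia" claim needs its metric checked first.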

AMD and Google IIRC are already working on a CUDA implementation for AMD GPUs.
How long do I need to wait until TensorFlow runs stably on AMD hardware, at the same speed it gets with cuDNN? I can't even contemplate buying AMD right now (gaming is not very important to me).
AMD's performance deficit is about to get a lot worse. Nvidia's upcoming Volta architecture is massively optimised for deep learning - they're touting a 12x performance increase for training and 6x for inferencing over Pascal.

I think Intel have a better chance of catching up with Nvidia at this stage. They've been on an acquisition spree and have picked up a huge amount of DL-related IP. They have immense R&D and fab resources at their disposal.

https://wccftech.com/nvidia-volta-tesla-v100-gpu-compute-ben...

That's only if you use Volta's Tensor cores however.

AMD's 16-bit packed performance with Vega is more than respectable vs NVidia's 16-bit packed performance in Pascal.

In the future, all AMD needs to catch up to Volta's Tensor cores is to build Tensor cores themselves. That doesn't seem like a major technical hurdle. I'm fairly certain that Google would be the primary patent holder on Tensor-cores.
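For anyone wondering what a Tensor core actually computes: as Nvidia has publicly described Volta, each one performs a fused multiply-accumulate D = A×B + C on small 4x4 tiles, with FP16 inputs and FP32 accumulation. Below is a plain-Python sketch of that arithmetic only. It emulates the rounding with the standard `struct` module's half and single precision formats, it says nothing about how the hardware is wired or in what order it accumulates, and the function names are my own.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def tile_mma(a, b, c):
    """Compute D = A*B + C for 4x4 tiles: FP16 inputs, FP32 accumulator."""
    n = 4
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = to_fp32(c[i][j])
            for k in range(n):
                # Inputs are rounded to FP16; the running sum is kept in FP32.
                # (Assumption: one FP32 rounding per step, a simplification.)
                acc = to_fp32(acc + to_fp16(a[i][k]) * to_fp16(b[k][j]))
            d[i][j] = acc
    return d


identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
a = [[0.1 * (i + j) for j in range(4)] for i in range(4)]
c = [[1.0] * 4 for _ in range(4)]

# Multiplying by the identity makes the FP16 input rounding easy to see.
d = tile_mma(a, identity, c)
print(d[0])
```

The arithmetic itself is nothing exotic, which fits the point above. The hard part is building it densely into the silicon and getting frameworks to use it.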

Google open sourced CUDA support for Clang/LLVM, which has been upstream a while now (years) and is kept up to date with various CUDA versions. This has not been a collaboration with AMD.

I haven't kept up to date on what AMD has done with that work, but I believe they use it as part of their compatibility story.

This. CUDA gives Nvidia a huge advantage
AMD has HIP. HIP allows developers to convert CUDA code to portable C++. The same source code can then be compiled to run on NVIDIA or AMD GPUs: https://github.com/ROCm-Developer-Tools/HIP
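To give a feel for what "converting CUDA to portable C++" means in practice: much of the CUDA runtime API has a HIP counterpart with the same shape and a `hip` prefix (`cudaMalloc` becomes `hipMalloc`, `cudaMemcpy` becomes `hipMemcpy`, and so on), so a large part of a port is mechanical renaming. The Python below is a toy illustration of that idea only. It is not the real hipify tooling from the HIP repo, the `toy_hipify` name and the small mapping table are mine, and a real port also has to deal with headers, libraries and anything that has no HIP equivalent.

```python
import re

# A small hand-picked subset of CUDA runtime names and their HIP counterparts.
CUDA_TO_HIP = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaMemcpyDeviceToHost": "hipMemcpyDeviceToHost",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cudaFree": "hipFree",
}

# Match whole identifiers only, so that cudaMemcpy does not clobber
# cudaMemcpyHostToDevice and unknown names are left untouched.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, CUDA_TO_HIP)) + r")\b")


def toy_hipify(source):
    """Rename the known CUDA runtime identifiers in `source` to HIP ones."""
    return _PATTERN.sub(lambda m: CUDA_TO_HIP[m.group(1)], source)


cuda_snippet = """
cudaMalloc(&d_x, n * sizeof(float));
cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);
cudaDeviceSynchronize();
cudaFree(d_x);
"""

hip_snippet = toy_hipify(cuda_snippet)
print(hip_snippet)
```

Anything outside the table (for example `cudaMallocManaged` here) passes through unchanged, which is roughly where the manual part of a port begins.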
> a CUDA alternative

this is what OpenCL is, afaik.

Apples to oranges. The two APIs aren’t comparable, and being forced to use OpenCL is a hindrance itself. But unfortunately it has “Open” in the name so it must be better than that proprietary single vendor CUDA nonsense... /s
...a very underwhelming one when it comes to Deep Learning
Well, mainly because Nvidia gimped their OpenCL drivers and there was no customer pushback.
OpenCL vendors never provided a competing tooling alternative to CUDA.

Drivers are only part of the story.

Even Google preferred to create their own Renderscript dialect rather than support OpenCL.

That would still be fine if OpenCL performed similarly on Intel/AMD and were a first-class citizen in the various Deep Learning frameworks. Sadly, OpenCL is just an afterthought there.
They could leverage OpenCL and they both already have implementations.
Then they'd better provide SYCL and sys backends for modern languages.
Intel and AMD had the opportunity in 2014 but they made a series of bad decisions that have put them waaaay behind. Nervana was a bad move for Intel and they've figured it out by now. AMD is deeply mismanaged and shareholders should lobby for change. OpenCL has been such a missed opportunity.

Nvidia will be hard to beat; they're going to be the next Intel.

> AMD is deeply mismanaged and shareholders should lobby for change

I don't know about that... Sure, they've had to liquidate some things, but that let them survive long enough for Ryzen and Vega to come out, which has enabled them to claw their way out of the red and into the black for the first time in forever.

PS: Am I right to assume that by "bad decisions" you mean things like the "Bulldozer" architecture with that wonky resources-shared-between-cores-thing?

By "bad decisions" I mean they completely overlooked the enterprise market and machine learning when it should have been obvious. Ryzen and Vega have such small profit margins compared to Nvidia's enterprise lineup. And that's their other problem: they keep chasing low-value markets. AMD's life in the black will be short-lived. Maybe Intel will just end up buying them, but then what's in it for Intel?
Intel most likely would not be allowed to buy AMD, something something monopoly, although a real lawyer could probably tell you more.
> OpenCL has been such a missed opportunity.

Why? It’s really a shit API, hard to program for and hard to make efficient. I think AMD’s problem is that they didn’t make their own tooling pipeline (and then hopefully make an open standard out of it). Instead they stuck with a crappy open standard with poor tooling because, well, it was the standard.

I should be more specific: Their strategy around getting people to use their cards for GPGPU is a missed opportunity, their OpenCL strategy being a component of that.
Ah, agreed.
This is the whole point: CUDA only got this far because OpenCL sucks, forcing everyone to use a C dialect.

They had to be beaten before they finally started providing a bytecode format similar to PTX, for multiple languages, while accepting that most researchers want to use C++ or migrate their Fortran code to GPUs.

OpenCL 2.1 was released two years ago and introduced a C++ version of the kernel language.
AMD only supports OpenCL 2.0 today.

So practically speaking, OpenCL 2.0 is the best you can get, unless you want to run on Intel's iGPUs or Intel's AVX 512 on their CPUs.

AMD does have support of C++ in OpenCL 1.2 as an optional extension, but their support of C++ in OpenCL 1.2 doesn't work when you enable OpenCL 2.0 for some reason. Also, the CodeXL debugger only works on OpenCL 1.2 at the moment...

As far as I can tell, AMD's implementation of OpenCL 2.0 is still early stage. It's fine if you're cool with debugging with "printf" statements.

I know, most of the drivers still don't support it properly after two years, and there are no good debuggers available.

Meanwhile on the CUDA C++ side,

"Designing (New) C++ Hardware" - https://www.youtube.com/watch?v=86seb-iZCnI

That was an interesting talk.