by mattpharr
3081 days ago
IMHO, the problem with the device model is that it imposes a bunch of unnecessary overhead on the programmer in cases where memory is shared and you're running on the same processor. If I just want to call a function, pass some parameters, have it do some work, and get a result, things like OpenCL require all sorts of annoying boilerplate just to pass parameter values, map buffers, copy results out, etc. Sure, it's all straightforward to write, but it's friction, and it's annoying.

Regarding clGetProgramInfo: does that return actual native executable code or IR? (I assume it's free to do either but in practice returns the latter, and that there's the usual "final driver compiler" between that code and what runs on the hardware, but I don't know.) One issue with that is that you can't be sure what will actually run on users' systems; you're at the mercy of whichever driver version they've got installed.
|
Agreed. It's quite a lot of work to orchestrate even a simple kernel.
> I assume it's free to do either
Looks that way - it seems AMD's engine lets you configure it. There are a bunch of 'non-native' representations:
* the OpenCL C source itself (which may end up getting stored in the ELF)
* LLVM IR
* AMDIL (based on LLVM IR but not identical)
* HSAIL (again, like LLVM IR but not identical)
* SPIR (yet again, except that later versions of this IR aren't directly based on LLVM)
http://openwall.info/wiki/john/development/AMD-IL
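If you want a rough idea of which of these you actually got back, one option is to sniff the first few bytes of the blob. This is only a sketch under assumptions: `sniff_program_binary` is a hypothetical helper name, the bytes are assumed to have come from clGetProgramInfo with CL_PROGRAM_BINARIES (or been read from a file), and it relies only on the well-known ELF, LLVM bitcode, and SPIR-V magic numbers. It doesn't try to distinguish AMDIL from HSAIL, and an "elf" result only tells you the container, not whether native code is inside it.

```python
import struct

ELF_MAGIC = b"\x7fELF"             # ELF container (may hold source, IR, and/or native code)
LLVM_BITCODE_MAGIC = b"BC\xc0\xde"  # raw LLVM bitcode, which SPIR 1.2/2.0 is based on
SPIRV_MAGIC = 0x07230203            # first 32-bit word of a SPIR-V module


def sniff_program_binary(blob: bytes) -> str:
    """Guess the container format of a program binary from its leading bytes.

    Hypothetical helper for illustration; returns one of
    "elf", "llvm-bitcode", "spir-v", "text", or "unknown".
    """
    head = blob[:4]
    if head == ELF_MAGIC:
        return "elf"
    if head == LLVM_BITCODE_MAGIC:
        return "llvm-bitcode"
    if len(head) == 4:
        # SPIR-V may be stored in either byte order, so check both.
        little = struct.unpack("<I", head)[0]
        big = struct.unpack(">I", head)[0]
        if SPIRV_MAGIC in (little, big):
            return "spir-v"
    # Mostly printable ASCII suggests OpenCL C source or a textual IL dump.
    sample = blob[:256]
    if sample and all(b in (9, 10, 13) or 32 <= b < 127 for b in sample):
        return "text"
    return "unknown"
```

Something like `sniff_program_binary(open("kernel.bin", "rb").read())` would then give a first guess (the file name is made up); telling apart what's inside an ELF would still mean walking its sections with a proper ELF tool.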
The poorly-documented "-fbin-exe" flag gets you the real native code.
http://developer.amd.com/wordpress/media/2013/07/AMD_Acceler...
I believe there's a way to get it to build for GPUs other than your own. Whether it's exposed through the API, I'm not sure, but I'm fairly sure it can be done with the dev tools.
(That took quite a bit of digging, which I suppose proves your point.)