by pavanky
4229 days ago
> As a side note, it seems odd to me that "native CPU" is a target distinct from OpenCL, which already runs on both CPUs and GPUs.

We are planning to move towards a single library that dynamically loads the appropriate backend depending on the runtimes / drivers available. If we relied completely on OpenCL, the same binary would not work on machines without the OpenCL SDKs installed.

> I think the product is in a tough position because most of the action these days is going towards "Big Data," where data doesn't fit on a single machine -- let alone a GPU -- or towards heavy number-crunching, where hand-rolled kernels will outperform generic array libraries

Well, that is a two-part question.

As for hand-rolled kernels, they will obviously be better if you know the problem type. But more often than not, our users are happy to get an "X" times speedup in "Y" hours as opposed to a "(1.2 - 1.3)X" speedup in "(3-5)Y" hours.

As for Big Data, this is something we are working on / towards. We have some ideas that will make scaling across multiple GPUs and multiple machines easier. Since we will be doing this publicly, I am sure we will get a lot of valuable feedback from the community.