by jlebar
3704 days ago
In fact, I think at the moment almost everything is in, other than the memory-space alias analysis and a few pass tuning tweaks. I know the former will be difficult to land, and I suspect the latter may be as well. I don't have a lot of benchmarks at the moment, so I can't say how important they are, and of course it depends on what you're doing. Clang/LLVM's CUDA implementation shares most of the backend with gpucc, but it's an entirely new front-end. The front-end works for TensorFlow, Eigen, and Thrust, but I suspect if you try hard enough you'll be able to find something nvcc accepts that we can't compile. At the moment we're pretty focused on making it work well for TensorFlow.