| HN Mirror


by wujingyue 3704 days ago

Thanks for your interest, and hope you like it!

Yes, it is currently incomplete, but I'd say at least 80% of the optimizations are already upstreamed. Also, folks in the LLVM community are actively working on the rest. For example, Justin Lebar recently pushed http://reviews.llvm.org/D18626, which adds the speculative execution pass to -O3.

Regarding performance, one thing worth noting is that missing one optimization does not necessarily cause a significant slowdown on the benchmarks you care about. For example, the memory-space alias analysis noticeably affects only one benchmark in the Rodinia benchmark suite.

Regarding your second question, the short answer is no. The Clang/LLVM version uses a different architecture (as mentioned in http://wujingyue.com/docs/gpucc-talk.pdf) from the internal version. The LLVM version offers better functionality and compilation time, and is much easier to maintain and improve in the future. Upstreaming the internal version would cost even more effort than making all the optimizations work with the new architecture.

2 comments

jlebar 3704 days ago

In fact, I think at the moment almost everything is in, other than the memory-space alias analysis and a few pass tuning tweaks. I know the former will be difficult to land, and I suspect the latter may be as well.

I don't have a lot of benchmarks at the moment, so I can't say how important they are. And it of course depends on what you're doing.

Clang/LLVM's CUDA implementation shares most of the backend with gpucc, but it's an entirely new front-end. The front-end works for TensorFlow, Eigen, and Thrust, but I suspect if you try hard enough you'll be able to find something nvcc accepts that we can't compile. At the moment we're pretty focused on making it work well for TensorFlow.


svensken 3704 days ago

Thanks for the clarification! It's always a pleasure to get a direct response from the first author on something as awesome as this.

I'm definitely subscribing to the llvm-dev list[1] in case any discussion on this continues there. There are also the llvm-commits, clang-dev, and clang-commits lists, but llvm-dev kinda seems like the right place for this.

Gpucc in LLVM is definitely a breath of fresh air for all of us nvcc users. Getting to see some compiler internals for CUDA feels like Christmas. A big thanks from me for all the upstreaming effort!

1: http://lists.llvm.org/mailman/listinfo/llvm-dev
