HN Mirror



	by JonChesterfield 837 days ago
	Yes, though the LLVM x64 backend may turn out to be very similar to the upstream one. This tool is the only existing one for compiling C to Intel's GPUs; there are no others.


PresidentZippy 837 days ago

Is there anything more to it than that? If not, the documentation would be a lot more helpful if it led with something straight to the point. Here's something that could go directly under the title of the README:

"This compiler consists of a custom LLVM frontend and backend. The backend compiles LLVM IR code into machine code consisting of x86_64 instructions and Intel GPU code. The frontend works in conjunction with the backend to compile C and C++ code with special optimization which when enabled, compiles OpenMP routines into hardware-accelerated code targeting Intel GPUs, FPGAs, or AMD and NVIDIA GPUs."

As someone who only used OpenMP academically, I don't see much of a point in that. In the post-C++11 world, where we can write type-safe compile-time code, preprocessor macro definitions should stay in C code.

Until Intel GPUs are at least competitive with the big boys, interop with their products doesn't concern me a whole hell of a lot. I'm not going to plan my scientific computing applications around the integrated graphics found on cheap Wintel consumer devices.


JonChesterfield 836 days ago

The docs say it's a proprietary compiler for Intel hardware. I'm inclined to believe them on that.

It's worth noting that OpenMP pragmas are a totally different thing to C preprocessor macros. A pragma like 'omp target parallel for' means something like "take the following loop, build a GPU kernel out of it, arrange for data to be copied back and forth and to launch that kernel when control flow gets here, and arrange to link in all the openmp libraries and also run a bunch of compiler optimisations". A macro means "replace these tokens with these other ones".
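A minimal sketch of that distinction, assuming a compiler with OpenMP offloading support. The function name `square_all` and the macro `SQUARE` are made up for illustration. The macro is plain token replacement; the pragma is a request to the compiler. If OpenMP is not enabled, the pragma is ignored and the loop runs serially on the host with the same result.

```cpp
#include <cstddef>
#include <vector>

// A macro is pure token substitution: the preprocessor rewrites SQUARE(v)
// to ((v) * (v)) before the compiler proper ever sees it.
#define SQUARE(v) ((v) * (v))

// Hypothetical helper, for illustration only.
// With an OpenMP offloading compiler, the pragma asks for the loop to be
// built as a device kernel, with the input copied to the device and the
// output copied back. Without OpenMP enabled, the pragma is ignored and the
// loop runs serially on the host, producing the same values.
std::vector<float> square_all(const std::vector<float>& in) {
    std::vector<float> out(in.size());
    const float* src = in.data();
    float* dst = out.data();
    const std::size_t n = in.size();

    #pragma omp target parallel for map(to: src[0:n]) map(from: dst[0:n])
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = SQUARE(src[i]);
    }
    return out;
}
```

The `map` clauses are the "arrange for data to be copied back and forth" part; nothing comparable exists in the macro.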

OpenMP is essentially a really big runtime library dealing with threads, scheduling execution, running code on GPUs and so forth. It is sort-of usable in that form. If you're determined then making calls directly into libomp.so and libomptarget.so will make your will a reality. All the pragma syntax is about transforming application code into a lot of calls into that library with appropriately constructed tables of data. And then the compiler works hard to optimise this, e.g. removing calls that don't need to happen, simplifying others, deduplicating some.
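A rough sketch of what "transforming application code into calls into that library" can look like. Everything here is a toy stand-in: `toy_runtime_fork`, `LoopArgs` and `outlined_body` are hypothetical names, not the libomp interface, and the toy runtime runs the workers one after another rather than on threads or a GPU.

```cpp
#include <cstddef>
#include <vector>

// Captured variables are packed into a struct the runtime can pass around.
// (Hypothetical layout, for illustration only.)
struct LoopArgs {
    const int* in;
    int* out;
    std::size_t n;
};

// The loop body, "outlined" into a function. Each worker handles the
// slice of iterations that corresponds to its id.
void outlined_body(std::size_t worker, std::size_t num_workers, void* raw) {
    auto* args = static_cast<LoopArgs*>(raw);
    const std::size_t begin = args->n * worker / num_workers;
    const std::size_t end = args->n * (worker + 1) / num_workers;
    for (std::size_t i = begin; i < end; ++i) {
        args->out[i] = args->in[i] * 2;
    }
}

// Hypothetical runtime entry point. A real runtime would hand each call to
// a thread or a device; this toy just runs the workers sequentially.
using OutlinedFn = void (*)(std::size_t, std::size_t, void*);

void toy_runtime_fork(OutlinedFn fn, std::size_t num_workers, void* args) {
    for (std::size_t w = 0; w < num_workers; ++w) {
        fn(w, num_workers, args);
    }
}

// What the application code turns into: the loop is gone, replaced by a
// runtime call plus a small table of data describing what the body needs.
std::vector<int> double_all(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    LoopArgs args{in.data(), out.data(), in.size()};
    toy_runtime_fork(outlined_body, 4, &args);
    return out;
}
```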

Syntactically OpenMP is a really good fit for Fortran. The invocations look completely appropriate there. For C++, it does tend to upset the sensibilities of the programmers. I personally think it's wildly funny for people who are content with the syntactic horror show of C++ to decide the OpenMP extensions are ugly but there we go, normalisation of deviance and all that.

On a more philosophical level, and what drew me to implementing OpenMP originally, CUDA is a problem. Not only in the vendor lock-in to NVIDIA sense - it's also a deeply nasty language to program with. I especially dislike the warp intrinsics - they take a bitmap corresponding to the CFG of your program, which you are supposed to compute manually (across branches, loops and so forth) and pass around into library functions. GPUs are excellent machines and I want to be able to program them in something which is not CUDA.
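To illustrate the bitmap bookkeeping, here is a host-side C++ simulation of a 32-lane warp. It is not CUDA: `toy_ballot` and `toy_masked_sum` are hypothetical stand-ins for the kind of intrinsic that takes a participation mask as its first argument (as `__ballot_sync` and `__shfl_sync` do). The point is only that the mask has to be derived from the control flow by hand and threaded into every collective call.

```cpp
#include <array>
#include <cstdint>

constexpr int kWarpSize = 32;

// Toy ballot: returns a bitmap with bit i set when lane i is currently
// active and its predicate is true.
std::uint32_t toy_ballot(std::uint32_t active,
                         const std::array<bool, kWarpSize>& predicate) {
    std::uint32_t mask = 0;
    for (int lane = 0; lane < kWarpSize; ++lane) {
        const bool is_active = ((active >> lane) & 1u) != 0;
        if (is_active && predicate[lane]) {
            mask |= (std::uint32_t{1} << lane);
        }
    }
    return mask;
}

// Toy masked reduction: sums the values of the lanes named in `mask`.
int toy_masked_sum(std::uint32_t mask,
                   const std::array<int, kWarpSize>& values) {
    int total = 0;
    for (int lane = 0; lane < kWarpSize; ++lane) {
        if (((mask >> lane) & 1u) != 0) {
            total += values[lane];
        }
    }
    return total;
}

// Sums the even values held by the warp's lanes.
int sum_even_values(const std::array<int, kWarpSize>& values) {
    const std::uint32_t all_lanes = 0xffffffffu;

    std::array<bool, kWarpSize> is_even{};
    for (int lane = 0; lane < kWarpSize; ++lane) {
        is_even[lane] = (values[lane] % 2 == 0);
    }

    // On entering the "if (is_even)" branch, the programmer has to work out
    // which lanes take it...
    const std::uint32_t in_branch = toy_ballot(all_lanes, is_even);

    // ...and pass that bitmap to every collective operation inside it.
    return toy_masked_sum(in_branch, values);
}
```

With nested branches and loops, each level needs its own mask derived from the enclosing one, which is the manual CFG tracking complained about above.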


PresidentZippy 836 days ago

Ok, now I'm learnding some interesting and/or valuable shit.

I'm familiar with compiler intrinsics (e.g. __sync_add_and_fetch), but I just assumed (incorrectly) that "#pragma omp parallel for" was just a macro that adds pthread API calls into a for loop to create new threads and join when finished.

>Syntactically OpenMP is a really good fit for Fortran.

I can get on board with Fortran for the niche of scientific computing, although again my qualms are with using it in C++. Too many people say they know C++, but then write "C++" code with raw pointers. I don't use C++17 for performance, and most "zero-cost abstractions" are a lie; I use it for type safety. If you buy into the modern C++ way, you'll catch a lot of stuff at compile time that systems programmers using C and web devs using a litany of other weakly or dynamically-typed languages catch in their production environment.
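A small sketch of the kind of compile-time checking meant here, using only C++11 features. The unit wrappers `Meters`, `Seconds` and `MetersPerSecond`, the `speed` function and the `Channel` enum are invented for illustration.

```cpp
#include <type_traits>

// Hypothetical unit wrappers: distinct types, so arguments cannot be
// swapped silently the way two bare doubles could be.
struct Meters { double value; };
struct Seconds { double value; };
struct MetersPerSecond { double value; };

constexpr MetersPerSecond speed(Meters distance, Seconds time) {
    return MetersPerSecond{distance.value / time.value};
}

// A scoped enum does not convert to int implicitly, unlike a plain enum
// or a preprocessor constant.
enum class Channel { Red, Green, Blue };
static_assert(!std::is_convertible<Channel, int>::value,
              "scoped enums must not decay to int");

// constexpr functions can be checked by the compiler itself.
static_assert(speed(Meters{10.0}, Seconds{2.0}).value == 5.0,
              "10 m in 2 s is 5 m/s");

// The following line would be rejected at compile time, because the
// argument types are the wrong way round:
// speed(Seconds{2.0}, Meters{10.0});
```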

>syntactic horror show of C++

Other than "string" not being a native type, I'd reckon what you really hate is not the syntax itself, but the compiler errors. Granted, if you hire the kind of people who post on Stack Overflow, you can get wacky shit like this:

template<typename Testicle, typename... Diseases> static std::optional<std::tuple<Diseases...>> deeply::nested::namespaced::classes::suck_balls(const Testicle& left_nut, Testicle&& right_nut) noexcept; // Is "classes" a class or another namespace?

But I've had to fix other people's spaghetti code in 4 other languages, so I stopped blaming the language many moons ago.

>what drew me to implementing OpenMP originally, CUDA is a problem

Did you try Vulkan Compute, and if so what problems did you run into? 200+ lines of "setup" code, similar to OpenCL programming?

I ask because the entirety of my systems programming career was not speeding up number crunching, but reducing IPC and making things run asynchronously.
