Let's Fix OpenGL [pdf] (cs.cornell.edu)
103 points by mad 3353 days ago
11 comments

I'm not an OpenGL expert or anything, but I get the impression the author doesn't really know what he's talking about and sounds a bit amateurish (I'm a bit hesitant to say that given it's a professor at Cornell..)

He seems to just hand-wave and says "well just use C you idiots". He criticizes Metal for using a version of C++14 that doesn't allow recursion, but offers no alternate solutions

The reason GLSL isn't C is because you can't do everything C does on a low end cellphone GPU - so obviously you have to restrict the language. The "cognitive load" of knowing the restrictions is overblown but also unavoidable.

SPIRV isn't even mentioned. DSLs are dismissed.

I didn't really follow what he didn't like about the preset input/output variables that are predefined in the different shaders. It's a bit ugly.. but that's also pretty much a non issue.

> The reason GLSL isn't C is because you can't do everything C does on a low end cellphone GPU - so obviously you have to restrict the language.

GLSL's limitations have nothing to do with low-end devices and performance (which is now becoming irrelevant compared to desktops).

Rather, GLSL is executed on the GPU, typically across many stream processors, and this type of parallelization makes using a general-purpose language like C difficult.

I think you need to familiarize yourself with some shader programming, at least, before you critique the author's domain.

I think he is familiar. The "you can't do" comment really means "you can't do with good performance." Typical C code would not run fast on a GPU. Indirect data access (pointer chasing) has an /extreme/ latency penalty!
As I understand it, GPU ISAs expose all of the functionality required to implement arbitrary programs, but the actual shader compilers don't use it, due to performance concerns.
This is the author's position: "This essay introduces OpenGL and its pitfalls for the PL-minded reader and advocates for more research that applies language ideas to this underserved domain. "

> He criticizes Metal for using a version of C++14 that doesn't allow recursion, but offers no alternate solutions

No, he criticizes GLSL and HLSL for extending C or C++ in an ad-hoc way, which makes them buggy and unreliable (a myriad of small differences between implementations).

"The need for ad hoc language extension also complicates compiler implementations: current compilers either need to reinvent a complete C-like parser and compiler from scratch [28] or hack an existing frontend such as GCC or Clang. Both approaches are error prone: [...]"

Someone being a professor doesn't correspond with them necessarily being a good engineer or expert on an engineering subject. Indeed, as people have limited time to spend on things they are interested in, satisfying the criterion for becoming a professor seems to often mean that you would not satisfy the criterion for being a good engineer -- time that you could have spent getting your hands dirty or gaining experience was spent doing other things, like writing papers on very specific subjects.

This is just my observation after dealing with people in academia and industry, however, so my words should be taken with a huge amount of salt. I think it's unfortunate that it's in PL where this divide tends to be most harmful.

I think you've misunderstood the author's position. I don't think he's arguing for C at all. Reading between the lines, the author is arguing for language-agnostic APIs, which would allow modern languages to be used to write GPU code.
Which already exists: it's called SPIR-V.
So SPIR-V or generating code for each ISA directly?
> He seems to just hand-wave and says "well just use C you idiots". He criticizes Metal for using a version of C++14 that doesn't allow recursion, but offers no alternate solutions

I don't think the author suggested anything like that. He recommended using C with some extensions, so it would be easy to write shaders in many different languages. Also, he criticized the poor implementation of Metal's shading language which requires a custom fork of Clang (which is apparently buggy), not necessarily the choice of using C++14.

> The reason GLSL isn't C is because you can't do everything C does on a low end cellphone GPU - so obviously you have to restrict the language.

That may have been true in the early days of OpenGL, but now that's probably no longer the case.

it most definitely is still the case. plenty of GPUs cannot run plain C.

looking at the metal language I'm not sure I'd personally call that C++ given all the restrictions

https://developer.apple.com/metal/metal-shading-language-spe...

I'd be curious if you can create static arrays, strings or static strings, do string compares etc. not that I'd want to do any of that on a GPU but rather if you tell me the language is C++ there are a ton of things I'd expect to be able to do that I'm pretty sure don't translate to GPUs

maybe I want to declare a local static array and modify it. maybe I want to create static arrays of colors and search for the closest match. I feel like having a more common language would just lead people to bang their heads against the impossible or implement horribly inefficient algos not really understanding what's going on under the hood

"I feel like having a more common language would just lead people to bang their heads against the impossible or implement horribly inefficient algos not really understanding what's going on under the hood"

I completely disagree. The author is not really arguing for a "more common language" anyhow - he's arguing for a GPU language that avoids many of the inefficiencies and semantic problems with the current approach. To boot, he's arguing for an approach that allows better static error checking, which it's hard to argue against.

I have to say it's sad that Apple went with Metal rather than working to evolve Vulkan as an excellent standard. We'll see how that works out in the long run...

> I feel like having a more common language would just lead people to bang their heads against the impossible or implement horribly inefficient algos not really understanding what's going on under the hood

Good point.

> The reason GLSL isn't C is because you can't do everything C does on a low end cellphone GPU - so obviously you have to restrict the language.

Yes, but why create another language that looks like C but behaves subtly differently? Could the compiler not simply reject usage of C features which are problematic for GPUs?

This author seems pretty clueless about how things actually work and a lot of the assumptions are just plain wrong. For example, ubershaders are preferred instead of many small shaders because the overhead of switching shaders and recompiling them is expensive. It is not something that people build because it is convenient (rather it is quite the opposite!) and then specialize with fixed parameters.
As someone who's a fan of PL (and spends a fair amount of time in the GPU space), I'm not sure I buy many of the arguments.

The reason that GPU drivers/APIs have few safety checks is that in graphics code, performance is valued above all else. Even simple calls can introduce overhead that's undesirable when you're making thousands of the same type of calls.

His example of baked shaders doesn't really seem to hold much value, since interactive shader builders (ShaderToy, UE3/4, etc.) are all content-driven anyway, so the extra code generation isn't a limiting constraint.

Nice effort but I don't see it solving actual pain points in production.

Safety checks can carry a performance penalty, but that isn't necessarily impossible to optimize away. Perhaps a new language is required? Can't we have a language for GPUs that makes the same promises Rust makes over C and C++?
If performance was valued above all else in a narrow sense, OpenGL would not be used and people would program GPUs natively.

In a wide sense that includes programmer productivity and end product robustness in the equation, well, safety checks sell themselves.

Bounds checking is very cheap on CPUs, but even more so on GPUs because GPUs are more rarely bottlenecked on simple local arithmetic.

>If performance was valued above all else in a narrow sense, OpenGL would not be used and people would program GPUs natively.

People do program GPU natively on game consoles. However, with so many GPUs in use (at least three major manufacturers each with multiple architectures just for desktop) it's impractical to write native graphics code. It's exactly the same concern as people have for CPU programming - it would be the best to write ISA code but given the variety of available CPUs the C/C++ is the best compromise between performance and usability. All attempts to push "better" language with forced run-time costs have failed to displace it.

>Bounds checking is very cheap on CPUs, but even more so on GPUs because GPUs are more rarely bottlenecked on simple local arithmetic.

Good news. GPUs do bounds checking in hardware already. The errors people want OpenGL to find are not about out-of-bounds but mostly of "I want to draw this but instead it's drawing that, what to do?" kind, caused by the complex state and its dynamic nature.

John Carmack weighed in on Twitter:

> ...some interesting thoughts, but the shading language is the least broken part of OpenGL.

> Lots of people consider automating the computation rate determination between fragment and vertex shaders, but it is a terrible idea.

https://twitter.com/ID_AA_Carmack/status/851258064909070336

A concrete problem that the author misses is the need for a better understanding of SPMD semantics. GLSL has the notion of "dynamically uniform" values, i.e. values that are the same across all shader threads that arise from one draw call, but this notion isn't really properly defined anywhere. It involves an unholy mixture of data flow and control flow that doesn't seem to appear anywhere else in PL theory.

Stuff kind of just works because GLSL doesn't have unstructured control-flow (i.e., there's no goto), and people have a mental model of what the hardware actually does and use that for the semantics.

But a proper study of those semantics, and how to carry it over to unstructured control-flow, or to what extent it is possible, would be awesome.
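
To make the informal definition above concrete, here is a toy Python model, not anything taken from the GLSL spec. It simulates a handful of invocations and calls an expression "dynamically uniform" when every invocation that evaluates it sees the same value. Treating skipped invocations as `None` is an assumption of this sketch, and all names in it are hypothetical.

```python
def is_dynamically_uniform(expr, invocation_ids, uniforms):
    """True if every invocation that evaluates `expr` gets the same value.

    `expr(invocation_id, uniforms)` returns None when that invocation's
    control flow skips the expression (an assumption of this toy model).
    """
    seen = {expr(i, uniforms) for i in invocation_ids} - {None}
    return len(seen) <= 1


ids = range(8)                 # eight simulated invocations of one draw call
uniforms = {"layer": 3}        # hypothetical uniform, same for every invocation


def from_uniform(i, u):
    # Depends only on a uniform: same value everywhere.
    return u["layer"] + 1


def from_id(i, u):
    # Depends on the invocation id: differs between invocations.
    return i % 4


def guarded(i, u):
    # Only odd invocations reach the expression; those that do agree.
    if i % 2 == 0:
        return None
    return u["layer"]


print(is_dynamically_uniform(from_uniform, ids, uniforms))  # True
print(is_dynamically_uniform(from_id, ids, uniforms))       # False
print(is_dynamically_uniform(guarded, ids, uniforms))       # True
```

The `guarded` case is where data flow and control flow get mixed: whether the value counts as uniform depends on which invocations reached it, and that is the part that gets hard once control flow is unstructured.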

> Potential solutions. Shader languages’ needs are not distinct enough from ordinary imperative programming languages to warrant ground-up domain-specific designs. They should instead be implemented as extensions to general-purpose programming languages. There is a rich literature on language extensibility [27, 36, 39] that could let implementations add shader-specific functionality, such as vector operations, to ordinary languages.

I like this part.

Apple's Metal shader language is C++14 with some restrictions, and extensions (such as native matrix and vector types), implemented with LLVM:

https://developer.apple.com/metal/metal-shading-language-spe...

Microsoft's HLSL is now also based on LLVM:

https://blogs.msdn.microsoft.com/directx/2017/01/23/new-dire...

Khronos has an LLVM-based translator between LLVM bitcode and SPIR-V:

https://github.com/KhronosGroup/SPIRV-LLVM

So things are converging, maybe one day GPU extensions will end up in the C++ standard ;)

Sadly, Khronos' LLVM only supports OpenCL, i.e. compute SPIRV code generation, is rather old (3.6.1) and is nowhere near good enough shape to be upstreamed. MS's is only slightly newer at 3.7, and the repo also contains a custom clang, so it would require some git surgery to get it upstream once stable. I don't think Apple's Metal is even open source.

However, I have an up-to-date fork of LLVM 5 (https://github.com/thewilsonator/llvm) that has Khronos' changes cleaned up a bit, i.e. the spirv triples are actually targets. Once I make the SPIRV OpenCL extension operations proper LLVM intrinsics instead of mangled C++ (and delete all the associated mangling code), there is no reason that can't go upstream.

I don't think SPIRV graphics support will be all that difficult to add once the intrinsics nonsense is fixed.

> So things are converging, maybe one day GPU extensions will end up in the C++ standard ;)

Not before D gets them ;) (Shameless plug: I will be speaking at DConf about this.)

This part is kind of addressed by having things like SPIR-V - shaders are defined with "bytecode", which itself is produced by a separate compiler. This makes things easier for language designers and driver developers. I don't see a reason why the same approach couldn't be implemented in OpenGL. In fact, it would be awesome to have a GL extension for SPIR-V that just adds a corresponding "spirv" binary format to glShaderBinary
You mean the GL_SHADER_BINARY_FORMAT_SPIR_V_ARB format, which is part of the GL_ARB_gl_spirv extension?
Huh, I didn't know that extension existed! :)
A nice illustration of half the issues with OpenGL.
Shaders are SIMT (single instruction multiple threads).
The abstraction and data layout cost penalties are extremely different in a GPU than on a general purpose CPU.
I wonder what such properly designed extensions would look like in Common Lisp or any other language with similar metaprogramming capabilities.
As someone who recently started learning to program GPUs, I enjoyed this read. I find the concept of a linear algebra-aware type system particularly compelling. I love the idea of the type system statically checking that I'm performing operations in and between the correct vector spaces. Is the fact that Vulkan uses SPIR-V sufficient to support the creation of languages that implement this?
If all you want is more static checking, it doesn't much matter if the graphics API consumes GLSL or SPIR-V. SPIR-V is mostly about factoring out part of the optimization phase from the driver to the compilation process. This speeds up the process of loading a shader, and more importantly lets developers control aspects of optimization they couldn't before.
Flicking through the article made me wonder - would it be possible to have something like ACID tests for OpenGL?
https://people.freedesktop.org/~nh/piglit/ is an extensive open conformance test suite for OpenGL. Khronos has their own (https://github.com/KhronosGroup/VK-GL-CTS, used to be proprietary), but judging by the number of bugs and piglit failures in drivers which supposedly pass it, I think it's probably not very useful.
Be aware that the open-source GL tests in vk-gl-cts still contain quite a lot of bugs from porting from the old framework that was used in proprietary test suite releases.

Piglit isn't really a conformance test suite. It is a test suite, is usually extended when people add new features to Mesa, and collects regression tests over time. However, it actually started out in part as a way to modify glean tests that drivers couldn't pass because the hardware was lacking, for example, hardware didn't have enough precision in the blender... those were the days. The p in piglit stands for your choice of pragmatic or practical. I'm not sure if that's actually documented anywhere, but as the original author, I should know ;-)

I'm sure piglit has bugs, but it's also well known that certain closed source drivers have less than conformant GLSL compilers, for example - so the fact that drivers passing the Khronos conformance tests fail piglit tests in itself doesn't mean anything other than shit needs investigating, and occasionally needs spec clarifications.

Fair enough, and good to hear from you :) piglit made my life better when I was doing mesa work.
Yup and Google has dEQP for android
dEQP seems to be included in Khronos', judging by a quick glance at the code and the READMEs.
There was a pixel-accurate test suite for OpenGL 1.0 from Khronos,[1] but it predates programmable shaders. It tests the geometry and texture processing. It's not free.

[1] https://www.khronos.org/opengl/adopters/

I don't think I fully understand the point about metaprogramming facilities. Sure, it would be nice to have compile-time ifs that get eliminated from the generated code if the condition isn't met. But I don't think this necessarily solves the problem of "combinatoric explosion" of different shader variants - you still have to generate a separate chunk of code for each possible combination of compile-time conditions. Unless a corresponding change is proposed at the level of shader bytecode (SPIR-V), which probably leads to opening a can of worms.
I'm fairly certain shader compilers can and do eliminate compile-time if statements; you can observe this by seeing which uniforms are defined (for example in WebGL).
You mean if we write

if (some_constant_expression) { //...code... }

and some_constant_expression evaluates to false, the entire code block will be stripped from the result?

That may be true, but it's not standardized behavior, is it?

True for many compilers for many years. It's a way to integrate 'conditional compilation' into the normal program flow. It can make the code easier (or harder) to read. Use judiciously.
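
To put a number on the "combinatoric explosion" point from earlier in this subthread: even if the compiler folds every constant `if`, each on/off combination of compile-time conditions is still its own shader variant. A minimal Python sketch that generates the `#define` preambles for a set of made-up feature flags (the flag names are hypothetical, not from the paper):

```python
from itertools import product


def variant_preambles(flags):
    """Return one '#define' preamble per on/off combination of the given flags."""
    preambles = []
    for values in product((0, 1), repeat=len(flags)):
        lines = [f"#define {name} {value}" for name, value in zip(flags, values)]
        preambles.append("\n".join(lines))
    return preambles


# Hypothetical feature flags, made up for illustration.
FLAGS = ["USE_SHADOWS", "USE_FOG", "USE_NORMAL_MAP"]

preambles = variant_preambles(FLAGS)
print(len(preambles))   # 2 ** 3 = 8 variants
print(preambles[-1])    # the variant with every flag switched on
```

Three flags already give 8 variants and ten would give 1024, each of which would have to be compiled separately if it is actually used.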
While I agree with the points about type safety with uniforms/attributes, I've found that this class of bugs didn't happen to me all that often in practice. The bug that happens way more often is calculations in the wrong coordinate system (or using two values in different coordinate systems in the same calculation), which the author also points out.
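
A rough idea of what catching the coordinate-system bug could look like, sketched in Python. It checks at runtime, whereas the paper argues for a static type system, so treat it only as an illustration of the rule being enforced. `Vec3`, `Transform` and the space names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec3:
    x: float
    y: float
    z: float
    space: str  # hypothetical tag such as "model", "world" or "view"

    def __add__(self, other):
        # Refuse to combine vectors that live in different coordinate systems.
        if self.space != other.space:
            raise TypeError(
                f"cannot add {self.space}-space and {other.space}-space vectors"
            )
        return Vec3(self.x + other.x, self.y + other.y, self.z + other.z, self.space)


@dataclass(frozen=True)
class Transform:
    rows: tuple  # 3x3 matrix stored as three row tuples
    src: str     # space the input must be in
    dst: str     # space the output ends up in

    def apply(self, v):
        if v.space != self.src:
            raise TypeError(f"transform expects {self.src}-space input, got {v.space}")
        x, y, z = (r[0] * v.x + r[1] * v.y + r[2] * v.z for r in self.rows)
        return Vec3(x, y, z, self.dst)


model_to_world = Transform(
    ((2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 2.0)), "model", "world"
)
offset_model = Vec3(0.0, 1.0, 0.0, "model")
light_world = Vec3(1.0, 0.0, 0.0, "world")

try:
    offset_model + light_world          # mixes model space and world space
except TypeError as err:
    print("rejected:", err)

ok = model_to_world.apply(offset_model) + light_world
print(ok)
```

The mixed-space addition is rejected. The version that goes through `model_to_world` first is accepted and comes back tagged as world space.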
I find it hard to test OpenGL. That would be my first approach to fixing it: provide a good way to debug and test the compiled programs.
As far as debugging goes, use KHR_debug extension, as well as RenderDoc or similar tools. As far as (automated) testing goes, I don't really know any solutions there beyond a "fuzzy" comparison to a set of "known good" images. But the problem is, this only works for regression testing. When implementing something new (say, you want to add reflections to your engine), there isn't anything to compare to, you just have to actually look at it.
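
For what it's worth, the "fuzzy" comparison against known-good images can be as simple as the Python sketch below. Images are plain lists of rows of (r, g, b) tuples so that it stays self-contained. The two thresholds are arbitrary assumptions, and a real setup would read back the framebuffer and load the reference image from disk.

```python
def images_match(actual, expected, per_channel_tol=2, max_bad_fraction=0.01):
    """Fuzzy-compare two images given as lists of rows of (r, g, b) tuples.

    A pixel is "bad" if any channel differs by more than per_channel_tol.
    The images match if the share of bad pixels is at most max_bad_fraction.
    """
    if len(actual) != len(expected):
        return False
    if any(len(a) != len(e) for a, e in zip(actual, expected)):
        return False
    total = 0
    bad = 0
    for row_a, row_e in zip(actual, expected):
        for px_a, px_e in zip(row_a, row_e):
            total += 1
            if any(abs(a - e) > per_channel_tol for a, e in zip(px_a, px_e)):
                bad += 1
    return total > 0 and bad / total <= max_bad_fraction


# A 4x4 "known good" image and two candidate renders.
golden = [[(10, 20, 30)] * 4 for _ in range(4)]
slightly_off = [[(11, 19, 30)] * 4 for _ in range(4)]   # rounding-level noise
broken = [row[:] for row in golden]
broken[0][0] = (255, 0, 0)                               # one clearly wrong pixel

print(images_match(slightly_off, golden))  # True
print(images_match(broken, golden))        # False (1 of 16 pixels is over 1%)
```

As said above, this only helps with regressions: somebody still has to look at the first render and decide it is the known-good one.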
Eh, you can already debug kernels on a line-stepping level. What more exactly do you need?
That sounds pretty awesome to me. What tool should I be using to do that for Intel Haswell on Linux?
CodeXL can do that (at least on Windows)

I think VTune can do OpenCL analyses, I don't know whether it has any graphics debugging capability (quite a lot of features are Windows-only or annoying to set up on Linux [because who needs a kernel with a stable ABI?], including managed analyses [1] :(

The simple truth here is that most development happens on Windows and thus most development tools are developed for Windows. This is almost universally true, with some notable exceptions (e.g. Valgrind can be a killer app; nodejs is so unportable that it requires a new Windows subsystem to run).

[1] That is, VTune detects whether and which invasive runtime you are using and is able to descend into and analyse what the runtime is doing. E.g. with a Python managed analysis VTune replaces all these recursive gibberish calls like PyEval_EvalFrameEx with the Python traces that it's executing. When I first used that mind=blown. Sadly, Windows only. But even then VTune is, without consciously trying to make this comment an advert, one of the best (if not the best) performance analysis tools I've ever used.

You would probably need to use a software renderer for that, I imagine.