by c-smile 2635 days ago
Speaking of UI needs, CPU rasterization is just half of the problem.

A 320 PPI display (a.k.a. "Retina") has about 11 times more pixels than an old, classic 96 PPI one, since (320/96)^2 ≈ 11.1.

So just attaching a new monitor would require a roughly 10 times faster CPU to render the same UI, if only CPU rasterizers are used.
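The scaling is easy to check: pixel count grows with the square of the density ratio. A minimal sketch of that arithmetic (the helper name `pixel_ratio` is made up for illustration):

```python
def pixel_ratio(new_ppi: float, old_ppi: float) -> float:
    """Ratio of pixel counts for the same physical area at two pixel densities."""
    # Density scales along both axes, so the ratio is squared.
    return (new_ppi / old_ppi) ** 2


if __name__ == "__main__":
    # 320 PPI vs. the classic 96 PPI baseline mentioned above.
    print(f"{pixel_ratio(320, 96):.1f}x as many pixels")
    # 192 PPI is exactly 2x the baseline density, i.e. 4x the pixels.
    print(f"{pixel_ratio(192, 96):.1f}x as many pixels")
```

This only counts pixels; the real CPU cost also depends on content, blending and memory bandwidth.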

Obviously, CPU-only rendering at that cost is not an option. That's why Direct2D and Skia contain both GPU and CPU rasterizers. That dualism complicates things quite a lot - two alternative renderers under the same API roof must produce similar results.

So Blend2D, to be a viable solution for UI, must be 10 times faster at rasterizing than any current alternative.

There was also the NV_path_rendering OpenGL extension from NVIDIA, aimed at 2D path rasterization, but that effort seems to have faded, as has OpenGL itself. The OpenGL architecture, which was created to run H/W accelerated full-screen apps, is far from adequate for windowed UI.

Microsoft's Direct2D is the best thing we have for H/W accelerated UI so far. And WARP mode in Direct2D (the CPU rasterizer) is pretty close to Blend2D - it also uses JIT for rasterizing, AFAIK.


It's true that increasing the size of the framebuffer demands more from the CPU as well. In my experience a single core on a modern machine has no problem rendering in real time into a FullHD framebuffer at a high frame rate (depending on the content of course, but UI is fine). Since a 4K framebuffer has four times as many pixels as FullHD, a multithreaded renderer using 4 threads should be able to render to it without any issues. And since AMD will release 16c/32t consumer CPUs this year, I see no problem on this front, as we will have the computational power to run several multithreaded renderers at the same time.

Blend2D has multithreaded rendering on the roadmap - I have experience in this topic and everything in Blend2D was designed with multithreading in mind (banding, for example). The implementation I'm planning would scale very well.
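To illustrate the general idea of banding (this is not Blend2D's actual implementation, just a toy sketch): the framebuffer is split into horizontal bands and each worker only writes rows of its own band, so no locking is needed on the pixel data. All names here (`make_bands`, `fill_rect_in_band`, `render`, `band_height`) are hypothetical, and Python threads will not give a real speedup because of the GIL; the point is only the structure.

```python
from concurrent.futures import ThreadPoolExecutor


def make_bands(height, band_height):
    """Split the row range [0, height) into consecutive (y0, y1) bands."""
    return [(y, min(y + band_height, height)) for y in range(0, height, band_height)]


def fill_rect_in_band(fb, width, band, rect, color):
    """Fill the part of rect = (x0, y0, x1, y1) that falls inside one band."""
    band_y0, band_y1 = band
    x0, y0, x1, y1 = rect
    x0, x1 = max(x0, 0), min(x1, width)  # clip horizontally to the framebuffer
    if x0 >= x1:
        return
    # Only rows shared by the rectangle and this band are touched.
    for y in range(max(y0, band_y0), min(y1, band_y1)):
        row = y * width
        fb[row + x0:row + x1] = [color] * (x1 - x0)


def render(width, height, rect, color, band_height=4, workers=4):
    """Render one rectangle into a flat framebuffer, one task per band."""
    fb = [0] * (width * height)
    bands = make_bands(height, band_height)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Bands cover disjoint rows, so tasks never write the same pixel.
        list(pool.map(lambda b: fill_rect_in_band(fb, width, b, rect, color), bands))
    return fb


if __name__ == "__main__":
    fb = render(16, 10, (2, 3, 6, 9), color=7)
    print(sum(1 for p in fb if p == 7), "pixels filled")
```

A real renderer would record the draw commands once and replay them per band; here a single rectangle stands in for the command list.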

NV_path_rendering - I haven't seen any detailed comparison, to be honest. Frame rate alone is not enough to compare CPU and GPU technology - memory consumption and power consumption are important as well, for example to calculate frame rate per watt.

I cannot comment on Direct2D as it's not open source and it runs only on a single operating system. So I don't consider Direct2D a competitor at the moment.

Well, CPU rasterization is always O(N) in complexity (where N is the number of pixels on screen). Multithreading just changes the constant factor, so by the math it is still O(N).

GPU rasterization, on the other hand, is near O(1) from the application's perspective - in ideal circumstances it does not depend on the number of pixels.

And having multiple threads render the UI is not desirable - there are too many CPU consumers on a modern desktop, e.g. an online radio that is playing right now, etc.

I am not saying that CPU rasterization makes no sense. Quite the contrary. As a practical example: in Sciter on Linux/GTK I am using the Cairo backend by default, as OpenGL inside GTK windows is horrible. So Skia does not help there at all - Cairo and its CPU rasterizer are used.

If we had something that could rasterize paths 5-10 times faster than current Cairo, it would solve all current desktop needs, I think.

In principle, 192 PPI resolution for desktop monitors of practical sizes (24 inch, 3840x2160 pixels) is OK - the human eye will not be able to see separate pixels. Mobile devices have a pixel count of the same order (iPad Pro: 2732x2048). These are the targets that need to be considered.
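For reference, the numbers behind those two targets can be computed directly. A small sketch, assuming the 24 inch figure is the panel diagonal (the helper names `ppi` and `pixel_count` are made up):

```python
import math


def ppi(width_px: int, height_px: int, diagonal_inches: float) -> float:
    """Pixels per inch, measured along the panel diagonal."""
    return math.hypot(width_px, height_px) / diagonal_inches


def pixel_count(width_px: int, height_px: int) -> int:
    """Total number of pixels in the framebuffer."""
    return width_px * height_px


if __name__ == "__main__":
    # 24 inch 4K desktop monitor (diagonal assumed to be 24 inches).
    print(f"24in 4K: {ppi(3840, 2160, 24):.0f} PPI, {pixel_count(3840, 2160):,} pixels")
    # iPad Pro resolution quoted above.
    print(f"iPad Pro: {pixel_count(2732, 2048):,} pixels")
```

By this estimate the 24 inch 4K panel comes out slightly under 192 PPI, and its framebuffer is roughly 1.5 times the size of the iPad Pro one.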

Practical requirements:

Take the HN site in a browser and open it full screen. A decent 2D library should be able to rasterize that amount of text at 60 FPS (e.g. during kinetic scrolling).
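To put that requirement into numbers, here is a rough worst-case budget, assuming every pixel of the framebuffer is written exactly once per frame (real frames may touch less or, with overdraw, more). The helper name `ns_per_pixel` is hypothetical:

```python
def ns_per_pixel(width_px: int, height_px: int, fps: float) -> float:
    """Time budget per pixel, in nanoseconds, if each pixel is written once per frame."""
    pixels_per_second = width_px * height_px * fps
    return 1e9 / pixels_per_second


if __name__ == "__main__":
    for name, (w, h) in {"FullHD": (1920, 1080), "4K": (3840, 2160)}.items():
        print(f"{name} @ 60 FPS: {ns_per_pixel(w, h, 60):.2f} ns per pixel")
```

At 60 FPS that is about 8 ns per pixel for FullHD and about 2 ns for 4K, for everything: rasterization, compositing and the final copy.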

I don't know which rasterizers you are referring to with:

  "CPU rasterization is always O(N) complex (where N is number of pixels on screen)"

But this is definitely not the case for Blend2D. I don't think you will find rasterizers with such properties in production software-based 2D rendering, as that would be really inefficient. What matters is the path boundary, and even that is often the worst-case scenario; Blend2D, for example, does much better than this.
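To make the point concrete, here is a toy scanline filler for convex polygons (not how Blend2D works internally, just a minimal sketch with made-up names). It never looks at the framebuffer size: only rows within the path's vertical extent are visited, and only the pixels inside each span are counted.

```python
import math


def convex_fill_spans(points):
    """Yield (y, x_start, x_end) pixel spans covering a convex polygon.

    Coverage is sampled at pixel centers; x_end is exclusive.
    """
    ys = [p[1] for p in points]
    y_first = math.ceil(min(ys) - 0.5)
    y_last = math.ceil(max(ys) - 0.5)  # exclusive
    n = len(points)
    for y in range(y_first, y_last):
        yc = y + 0.5  # sample at the vertical center of the pixel row
        xs = []
        for i in range(n):
            (xa, ya), (xb, yb) = points[i], points[(i + 1) % n]
            # Half-open test skips horizontal edges and avoids double hits.
            if (ya <= yc < yb) or (yb <= yc < ya):
                xs.append(xa + (yc - ya) * (xb - xa) / (yb - ya))
        if len(xs) >= 2:
            x_start = math.ceil(min(xs) - 0.5)
            x_end = math.ceil(max(xs) - 0.5)
            if x_end > x_start:
                yield y, x_start, x_end


def pixels_touched(points):
    """Number of pixels the filler would write for this polygon."""
    return sum(x_end - x_start for _, x_start, x_end in convex_fill_spans(points))


if __name__ == "__main__":
    rect = [(10, 10), (30, 10), (30, 20), (10, 20)]
    # The same 20x10 rectangle costs the same on a FullHD or a 4K framebuffer.
    print(pixels_touched(rect), "pixels touched")
```

The work here scales with the height of the path and the pixels it covers, not with the number of pixels on screen.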

I see no problem with multithreading, because it doesn't mean that all CPU cores will be busy with rendering. It means that the total time required to render a frame will be much lower, while CPU power is used in a more distributed way instead of stressing a single core. You can use 2-4 threads on an 8-core machine, for example, leaving the rest for other real-time tasks if required. Applications that use the GPU for 2D rendering also use the full power of the GPU, if available.

Single core performance is stagnating while the number of CPU cores is increasing, so it's simply practical to design software to take advantage of that.