by glowcoil 1925 days ago
> > This makes it clear you didn't even perform a cursory investigation of the project

> I did, and mentioned in the docs, here’s a quote: “I didn’t want to experiment with GPU-based splines. AFAIK the research is not there just yet.”

Not what I said. I said that you didn't investigate the project discussed in the original blog post before declaring, in your words, that "the quality is not good" and comparing it to your own library.

Vrmacs and piet-gpu are two totally different types of renderer. Vrmacs draws paths by decomposing them into triangles, rendering them with the GPU rasterizer, and antialiasing edges using screen-space derivatives in the fragment shader. This approach works great for large paths, or paths without too much detail per pixel, but it isn't really able to render small text, or paths with a lot of detail per pixel, with the necessary quality. (Given this and the other factors you mentioned in your reply, rendering text on the CPU with Freetype is a perfectly reasonable engineering choice and I am not criticizing it in the slightest.)
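To make the derivative-based antialiasing idea concrete, here is a minimal CPU-side sketch. It is not Vrmac's actual shader: `fwidth` only imitates the GLSL built-in with one-pixel finite differences, and `half_plane` is a hypothetical per-fragment value that is negative inside the shape and positive outside.

```python
def fwidth(f, x, y):
    """Imitate GLSL fwidth(): |df/dx| + |df/dy| from one-pixel forward differences."""
    return abs(f(x + 1, y) - f(x, y)) + abs(f(x, y + 1) - f(x, y))


def edge_alpha(f, x, y):
    """Estimate pixel coverage from f (negative inside, positive outside)."""
    w = fwidth(f, x, y)
    if w == 0.0:
        # f is locally constant: the pixel is fully inside or fully outside.
        return 1.0 if f(x, y) < 0 else 0.0
    # Rescale f so that it changes by 1 per pixel, then center the ramp on the edge.
    return min(1.0, max(0.0, 0.5 - f(x, y) / w))


def half_plane(x, y):
    """Hypothetical shape: everything left of x = 10.5, sampled at pixel centers."""
    return (x + 0.5) - 10.5


# Pixel column 10 straddles the edge, so it comes out half covered.
print([edge_alpha(half_plane, x, 0) for x in (9, 10, 11)])
```

The ramp is only about one pixel wide, which is why this kind of estimate holds up for large shapes but has little to work with once several edges fall inside the same pixel.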

In comparison, piet-gpu decomposes paths into line segments, clips them to pixel boundaries, and analytically computes pixel coverage values using the shoelace formula/Green's theorem, all in compute shaders. This is more similar to what Freetype itself does, and it is perfectly capable of rendering high-quality small text on the GPU, in a way that Vrmacs isn't without shelling out to Freetype.
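As a rough illustration of analytic coverage (not piet-gpu's actual compute kernels, which accumulate area per line segment), the sketch below clips a convex polygon to one pixel with Sutherland-Hodgman and measures what is left with the shoelace formula. All function names here are made up for the example.

```python
def shoelace_area(poly):
    """Signed area of a polygon [(x, y), ...]; positive when counter-clockwise."""
    total = 0.0
    for i in range(len(poly)):
        x0, y0 = poly[i]
        x1, y1 = poly[(i + 1) % len(poly)]
        total += x0 * y1 - x1 * y0
    return 0.5 * total


def clip_half_plane(poly, axis, bound, keep_greater):
    """One Sutherland-Hodgman pass: keep the side coord >= bound (or coord <= bound)."""
    def inside(p):
        return p[axis] >= bound if keep_greater else p[axis] <= bound

    out = []
    for i in range(len(poly)):
        a, b = poly[i], poly[(i + 1) % len(poly)]
        if inside(a):
            out.append(a)
        if inside(a) != inside(b):
            # The edge crosses the clip line, so emit the intersection point.
            t = (bound - a[axis]) / (b[axis] - a[axis])
            out.append((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
    return out


def pixel_coverage(poly, px, py):
    """Fraction of the unit pixel [px, px+1] x [py, py+1] covered by a convex polygon."""
    for axis, bound, keep_greater in ((0, px, True), (0, px + 1, False),
                                      (1, py, True), (1, py + 1, False)):
        poly = clip_half_plane(poly, axis, bound, keep_greater)
        if not poly:
            return 0.0
    return abs(shoelace_area(poly))


triangle = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
print(pixel_coverage(triangle, 0, 0), pixel_coverage(triangle, 1, 0))
```

Because the result is an exact area rather than an estimate from a local gradient, it stays meaningful even when the geometry inside a pixel is much smaller than the pixel itself.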

Again, to be clear, I'm not criticizing any of the design choices that went into Vrmacs; it looks like it occupies a sweet spot similar to NanoVG or Dear ImGui, where it can take good advantage of the GPU for performance while still being simple and portable. My only point here is that you performed insufficient investigation of piet-gpu before confidently making an uninformed claim about it and putting it in a somewhat nonsensical comparison with your own project.


> in your words, that "the quality is not good"

Oh, you were asking why I said so? Because I clicked the “notes document” link in the article, the OP used the same tiger test image as me, and that document has a couple of screenshots. Those were the only screenshots I found. Compare them to screenshots of the same vector image rendered by my library, and you’ll see why I commented on the quality.

> Vrmacs draws paths by decomposing them into triangles, rendering them with the GPU rasterizer, and antialiasing edges using screen-space derivatives in the fragment shader.

More or less, but (a) not always, thin lines are different. (b) that’s a high-level overview but there’re many important details on the lower levels. For instance, “screen-space derivatives of what?” is an interesting question, critically important for correct and uniform stroke widths. The meshes I’m building are rotation-agnostic, and to some extent (but not completely) they are resolution-agnostic too.

> and it is perfectly capable of rendering high-quality small text on the GPU

It is, but the performance overhead is massive compared to the GPU rasterizer rendering these triangles. For real-world vector graphics that don’t have too much stuff per pixel, that complexity is not needed because triangle meshes are good enough already.

> it looks like it occupies a sweet spot similar to NanoVG

There are similarities; I have copy-pasted a few text-related things from my fork of NanoVG: https://github.com/Const-me/nanovg/ However, Vrmac delivers much higher quality of 2D vector graphics (VAA, circular arcs, thin strokes, etc), is much faster (meshes are typically reused across frames, I use more than 1 CPU core, and the performance-critical pieces are in C++ manually vectorized with NEON or SSE), and is more compatible (GL support on Windows or OSX is not good, you want D3D or Metal respectively).

The document explains above the tiger image (like, directly above it) that it is a test image meant to evaluate a hypothesis about fragment shader scheduling:

> Update (7 May): I did a test to see which threads in the fragment shader get scheduled to the same SIMD group, and there’s not enough coherence to make this workable. In the image below, all pixels are replaced by their mean in the SIMD group (active thread mask + simd_sum)

I cloned the piet-gpu repository and was able to render a very nice image of the Ghostscript tiger: https://imgur.com/a/swyW0gl

Way better than in the article, but still, I like my results better.

The problematic elements are thin black lines. In your image the lines are aliased, which is visible for lines that are close to horizontal but not quite. And for curved thin lines, this results in visually non-uniform thickness along the line.

The original piet-metal codebase has a tweak where very thin lines are adjusted to thicker lines with a smaller alpha value, which improves quality there. This has not yet been applied to piet-gpu.
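A minimal sketch of that kind of tweak, assuming a 1-pixel minimum width and a linear trade of width for alpha (both are assumptions for illustration; the actual piet-metal code may differ, and `adjust_thin_stroke` is a made-up name):

```python
def adjust_thin_stroke(width_px, alpha, min_width_px=1.0):
    """Widen sub-threshold strokes and fade them so total ink stays roughly constant."""
    if width_px >= min_width_px:
        return width_px, alpha
    # Draw at the minimum width, but scale alpha by the fraction of width lost.
    return min_width_px, alpha * (width_px / min_width_px)


# A quarter-pixel opaque hairline becomes a 1-pixel line at 25% alpha.
print(adjust_thin_stroke(0.25, 1.0))
```

The idea is that a faint but continuous line tends to look more uniform than a full-strength one that flickers between zero and one covered pixel along its length.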

One of the stated research goals [1] of piet-gpu is to innovate quality beyond what is expected of renderers today, including conflation artifacts, resampling filters, careful adjustment of gamma, and other things. I admit the current codebase is not there yet, but I am excited about the possibilities in the approach, much more so than pushing the rasterization pipeline as you are doing.

[1]: https://github.com/linebender/piet-gpu/blob/master/doc/visio...

I have doubts. The reason rasterizers are so good by now is that games have been pushing fill rate, triangle count, and texture sampler performance and quality for more than a decade.

Looking forward, I’d rather expect practical 2D renderers using the tech made for modern games. Mesh shaders, raytracing, deep learning, and even smaller features like sparse textures. These are the areas where hardware vendors are putting their transistors and research budgets.

None of the features you mentioned is impossible with rasterizers. Hardware MSAA mostly takes care of conflation artifacts, and gamma is doable with higher-precision render targets (e.g. Windows requires FP32 support since D3D 11.0).
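For the gamma point, a small sketch of why precision and color space matter when blending antialiased edges. It uses the standard sRGB transfer functions; `blend_linear` is a hypothetical helper, not code from either project.

```python
def srgb_to_linear(c):
    """Decode an sRGB-encoded channel in [0, 1] to linear light."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(l):
    """Encode a linear-light channel in [0, 1] as sRGB."""
    return 12.92 * l if l <= 0.0031308 else 1.055 * l ** (1.0 / 2.4) - 0.055


def blend_linear(src, dst, coverage):
    """Blend two sRGB-encoded channels by coverage, doing the math in linear light."""
    mixed = coverage * srgb_to_linear(src) + (1.0 - coverage) * srgb_to_linear(dst)
    return linear_to_srgb(mixed)


# A half-covered white-on-black edge pixel: naive blending of the encoded
# values gives 0.5, while blending in linear light gives a brighter value.
print(blend_linear(1.0, 0.0, 0.5))
```

Doing the blend in linear light needs intermediate values with more precision than an 8-bit encoded target provides, which is where the higher-precision render targets come in.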