Blend2D uses a dense cell buffer, similar to font-rs. However, it works quite differently, and this difference allows Blend2D to stay efficient even when rendering large paths:
- Dense cell buffer with one 32-bit integer per cell (FreeType/Qt use a sparse cell buffer with two 32-bit integers per cell; font-rs uses floats, if I'm not mistaken)
- Blend2D builds edges before the rasterization step; the edge builder is optimized and performs clipping and curve flattening
- Rasterization and composition happen in bands, so the storage required by the cell buffer is quite small (a band is currently 32 scanlines, but we will make it adjustable based on width)
- To complement the dense cell buffer, Blend2D also uses a shadow bit-buffer to mark cells that were altered by the rasterizer
- The compositor scans the bit-buffer instead of the cell buffer to quickly skip areas that were not touched by the rasterizer
- The compositor is SIMD optimized and calculates multiple masks at the same time (at the moment it works on 4 pixels at a time, but this can easily be extended to 8 and 16 pixels)
- The compositor clears both the cell buffer and the shadow bit-buffer during composition, so when it finishes, these buffers are zeroed and ready for another band
- Blend2D maintains a small pool of zeroed memory that is used to quickly allocate the cell and bit buffers when you create a rendering context; the buffers are returned to the pool when you destroy it
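To make the interplay between the dense cell buffer and the shadow bit-buffer more concrete, here is a minimal sketch. It is not Blend2D's actual code: the `Band` type, its members, the one-bit-per-cell layout, the single-scanline band, and the scalar (non-SIMD) loop are all assumptions chosen for brevity. Coverage is fixed-point with 256 meaning one fully covered pixel, and the non-zero fill rule is hard-coded.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Index of the lowest set bit; real code would use a hardware instruction.
static inline int lowestSetBit(uint64_t v) {
  int n = 0;
  while ((v & 1u) == 0) { v >>= 1; ++n; }
  return n;
}

// Converts accumulated cover to 8-bit alpha (non-zero fill rule).
static inline uint8_t toAlpha(int32_t cover) {
  return uint8_t(std::min(std::abs(cover), 256) * 255 / 256);
}

// Hypothetical single-scanline band; names and layout are illustrative only.
struct Band {
  int width;
  std::vector<int32_t> cells;  // dense: one 32-bit cell per pixel
  std::vector<uint64_t> bits;  // shadow bit-buffer: one bit per cell

  explicit Band(int w) : width(w), cells(w, 0), bits((w + 63) / 64, 0) {}

  // Rasterizer side: accumulate a signed coverage delta and mark the cell.
  void addDelta(int x, int32_t delta) {
    assert(x >= 0 && x < width);
    cells[x] += delta;
    bits[x / 64] |= uint64_t(1) << (x % 64);
  }

  // Compositor side: writes alpha into `out` (assumed to start zeroed),
  // reading only cells whose bit is set. Untouched runs are either skipped
  // (cover == 0) or filled with a constant alpha. Both buffers are cleared
  // on the way. Returns the number of cells that were actually read.
  int composite(std::vector<uint8_t>& out) {
    assert(out.size() >= size_t(width));
    int32_t cover = 0;
    int prevX = 0;
    int cellsRead = 0;
    for (size_t w = 0; w < bits.size(); ++w) {
      uint64_t word = bits[w];
      bits[w] = 0;                      // clear the shadow bits as we go
      while (word != 0) {
        int x = int(w) * 64 + lowestSetBit(word);
        word &= word - 1;               // drop the lowest set bit
        if (cover != 0)                 // constant-coverage run [prevX, x)
          std::fill(out.begin() + prevX, out.begin() + x, toAlpha(cover));
        cover += cells[x];
        cells[x] = 0;                   // clear the cell as we go
        ++cellsRead;
        out[x] = toAlpha(cover);
        prevX = x + 1;
      }
    }
    if (cover != 0)
      std::fill(out.begin() + prevX, out.begin() + width, toAlpha(cover));
    return cellsRead;
  }
};
```

For a 1000-pixel scanline with a span covering pixels 10 to 19, the rasterizer side touches two cells (`addDelta(10, 256)` and `addDelta(20, -256)`), and the compositor side reads 16 bit-buffer words plus those two cells rather than all 1000 cells.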
There are probably more differences, such as the parametrization of the NonZero and EvenOdd fill rules, but these are really implementation details that minimize the number of pipelines generated by common rendering operations.
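As an illustration of what parametrizing the fill rule can mean, the sketch below drives one branch-free conversion from accumulated cover to alpha with two constants, so the same code path serves both rules. This is my own formulation, not Blend2D's actual formula; `FillParams`, `kNonZero`, `kEvenOdd`, and `coverToAlpha` are hypothetical names, and coverage is assumed to be fixed-point with 256 meaning one full winding.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical per-rule constants; the conversion code itself never branches
// on the fill rule.
struct FillParams {
  int32_t mask;  // applied to |cover| before folding
  int32_t top;   // upper clamp applied after masking
};

// NonZero: keep the full magnitude, then clamp to 256 (fully covered).
static const FillParams kNonZero = { 0x7FFFFFFF, 256 };
// EvenOdd: wrap the magnitude into [0, 511]; the clamp is then a no-op.
static const FillParams kEvenOdd = { 511, 511 };

// Returns alpha in [0, 256] for a signed, accumulated cover value.
static inline int32_t coverToAlpha(int32_t cover, const FillParams& p) {
  int32_t a = std::min(std::abs(cover) & p.mask, p.top);
  // Triangle fold: identity on [0, 256], mirrored (512 - a) on (256, 511].
  return 256 - std::abs(a - 256);
}
```

With these constants, a cover of 512 (two overlapping windings) maps to full alpha under NonZero and to zero under EvenOdd, while a cover of 384 maps to 256 and 128 respectively.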
The advantage of font-rs is in rendering small paths; large paths incur increasing overhead, as a lot of computation happens on zero cells. Blend2D's rasterizer is universal and was tuned to perform well from small to 4K framebuffers.
When I was designing Blend2D's rasterizer, I wrote around 20 rasterizers and benchmarked them against each other at various resolutions. Some of them were similar to font-rs (but not using floats), and these were only competitive at very small resolutions like 8x8 and 16x16 pixels. At larger resolutions these rasterizers would always lose, as a shadow bit-buffer scan is much quicker than going through zero cells, especially if you do pixel compositing.
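For contrast, a font-rs-style accumulation pass looks roughly like the sketch below. It is modelled on my understanding of font-rs (a float accumulation buffer turned into alpha by a running sum), written in C++ rather than Rust; the function name `accumulateDense` is hypothetical. The point is that the loop visits every cell of the buffer, including all the zero ones, so its cost grows with the area rather than with the number of cells the rasterizer touched.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Dense accumulation: a running sum over every cell, touched or not.
// Each cell holds a signed coverage delta where 1.0 is one full pixel.
std::vector<uint8_t> accumulateDense(const std::vector<float>& cells) {
  std::vector<uint8_t> out(cells.size());
  float acc = 0.0f;
  for (std::size_t i = 0; i < cells.size(); ++i) {
    acc += cells[i];                           // cumulative sum per pixel
    float y = std::min(std::fabs(acc), 1.0f);  // non-zero rule, clamped
    out[i] = uint8_t(255.0f * y);
  }
  return out;
}
```

At 8x8 or 16x16 the whole buffer fits in a few cache lines, so this simple loop is hard to beat; on a 4K framebuffer with a sparse path, most of the iterations add zero to zero.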
There are demo samples in the bl-samples-qt repository that render animated content and offer both Blend2D and Qt rendering options. You can check them out to see how the rasterizer competes against Qt.
It does, thanks! This does indeed look like good stuff. It is true that font-rs is not fully optimized for large areas, as it depends on memory fill and cumulative sum for each pixel. These are not hugely expensive, but it sounds like a sophisticated approach (as taken by Blend2D) can do better. Happy to see the work.
Let me know if that answers your question.