by arghwhat 282 days ago
The problem with these kinds of blur effects is not the cost of a Gaussian blur (this isn't a Gaussian blur anyway, as it has a lens effect near the edges). It's damage propagation and pipeline stalls.

When you have a frosted glass overlay, any pixel change anywhere near the overlay (not just directly underneath) requires the whole overlay to be redrawn, and that redraw stalls until the entire previous render pass has completed, since only then are the pixels valid to read.
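A rough sketch of that damage rule, assuming axis-aligned rectangles and a blur that samples up to `blur_radius` pixels outside the overlay. `Rect` and `overlay_needs_redraw` are hypothetical names for illustration, not any compositor's API, and a real compositor may track damage more finely than "whole overlay or nothing":

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int

    def inflate(self, r: int) -> "Rect":
        # Grow the rectangle by r pixels on every side.
        return Rect(self.x - r, self.y - r, self.w + 2 * r, self.h + 2 * r)

    def intersects(self, other: "Rect") -> bool:
        # Strict overlap test: rectangles that only touch do not intersect.
        return (self.x < other.x + other.w and other.x < self.x + self.w
                and self.y < other.y + other.h and other.y < self.y + self.h)


def overlay_needs_redraw(damage: Rect, overlay: Rect, blur_radius: int) -> bool:
    # The blur reads pixels up to blur_radius away, so damage that is merely
    # *near* the overlay still invalidates it.
    return damage.inflate(blur_radius).intersects(overlay)
```

With an overlay at `Rect(100, 0, 50, 50)`, a 5x5 change at `Rect(90, 0, 5, 5)` does not touch the overlay at all, yet with a 10 pixel radius it still forces the redraw.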

The GPU isn't busy in any of this. But it has to stay awake notably longer, which is the worst possible sin when it comes to power efficiency and heat management.

Yes, that all makes sense! My understanding is that the damage propagation gets worse with depth (no limit) in addition to breadth (screen size). If the compositor has N layers, a blur layer, N more layers, another blur layer, etc. then there are a lot of "offscreen render passes" where you have to composite arbitrary sets of layers exclusively for the purpose of blurring them.
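To make the depth argument concrete, here is a toy model, assuming the stack is a bottom-to-top list of the strings `"layer"` and `"blur"` and ignoring occlusion entirely. `layers_beneath_each_blur` is a made-up helper, not how any real compositor represents its scene:

```python
def layers_beneath_each_blur(stack: list[str]) -> list[int]:
    """For each blur entry, count the entries below it that must already be
    composited into an intermediate buffer before the blur can sample them."""
    return [index for index, kind in enumerate(stack) if kind == "blur"]


# N = 3 layers, a blur, 3 more layers, another blur.
stack = ["layer"] * 3 + ["blur"] + ["layer"] * 3 + ["blur"]
passes = layers_beneath_each_blur(stack)
```

Each entry in `passes` is one extra intermediate composite, and the second blur depends on the result of the first, so the passes cannot run in parallel.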

It's true that the GPU itself is not busy during a lot of this because it's waiting on pixels, but whatever is preparing the pixels (copying memory) is super busy.

Downscaling is a win not just for the blurring, but primarily for the compositing. KDE describes the primary constraint as the number of windows and how many of them need to be blended:

  The performance impact of the blur effect depends on the number of open and translucent windows
https://userbase.kde.org/Desktop_Effects_Performance#Blur_Ef...
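Back-of-the-envelope arithmetic for why downscaling helps the compositing step, assuming blend work is simply proportional to pixels times layers (a simplification; `blend_pixels` is a hypothetical helper, not something from KDE):

```python
def blend_pixels(width: int, height: int, layers: int, scale: int = 1) -> int:
    """Pixels touched when blending `layers` layers into a buffer that has
    been downscaled by an integer `scale` factor in each dimension."""
    return (width // scale) * (height // scale) * layers


full = blend_pixels(1920, 1080, layers=4)           # blend at full resolution
half = blend_pixels(1920, 1080, layers=4, scale=2)  # blend at half resolution
```

Halving each dimension cuts the blended pixel count to a quarter, and the saving applies to every layer that has to be blended, not just to the blur kernel.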
As long as the lower blur layers are not fully occluded by opaque content, then yes - they all need to be evaluated, and sequentially due to their dependency. This is also true if there is transparency without blur for that matter, but then you're "just" blending.

Note that there are some differences when it's the display server that has to blur general output content on behalf of an app not allowed to see the result, vs. an app that is just blurring its own otherwise opaque content, but it's costly regardless.

(There isn't really anything like on-screen vs. off-screen, just buffers you render to and consume. Updating window content is a matter of submitting a new buffer to show, updating screen content is a matter of giving the display device a new buffer to show. For mostly hysterical raisins, these APIs tend to still have platform abstractions for window/surface management, but underneath these are just mini-toolkits that manage the buffers and hand them off for you.)

Yeah, the “offscreen” terminology is Apple lingo:

https://developer.apple.com/documentation/Metal/customizing-...

The buffers are not that different; it really just means “extra allocation”.

It's not uncommon terminology in the WSI (window system integration) portion of graphics APIs; I was just pointing out that it doesn't actually mean anything to the hardware/lower stack. There are only buffers.

(There can be restrictions on which buffer formats and layouts can be used for certain things, scanout being particularly picky, but a regular window can be textured from pretty much any buffer.)

Thankfully, you can probably turn it off, as usual.