| > Your benchmark doesn't match the experience of people building games and applications on top of WebGPU. Here's an example of Bevy WebGL vs Bevy WebGPU: I get 50 fps on 78k birds with WebGPU: https://bevyengine.org/examples-webgpu/stress-tests/bevymark... I get 50 fps on 90k birds with WebGL: https://bevyengine.org/examples/stress-tests/bevymark/ So you can test the difference between them with technically the same code. (They can get 78k birds, which is way better than my triangles, because they batch 'em. I know 10k draw calls doesn't seem good, but any 2024 computer can handle that load with ease.) Older frameworks will get 10x better results, such as Kha (https://lemon07r.github.io/kha-html5-bunnymark/) or OpenFL (https://lemon07r.github.io/openfl-bunnymark/), but they run at lower res and this is a very CPU-based benchmark, so I'm not gonna count them. > be limited by the fill rate of your GPU They're 10k triangles and they're not overlapping... There are no textures per se. No passes except the main one, with a 1080p render texture. No microtriangles. And I bet the shader is less than 0.25 ALU. > at which point you should see roughly the same performance across all APIs. Nah, ANGLE (OpenGL) does just fine. Unity as well. > a lower GPU usage could actually suggest that you're bottlenecked by the CPU No. I have yet to see a game on my computer that uses more than 0.5% of my CPU. Games are usually GPU bound. |
I think a better comparison would be one that is more representative of a real game scene, because modern graphics APIs are meant to optimize typical rendering loops and might even add overhead to trivial test cases like bunnymark.
That said, they're already comparable, which seems great considering how little performance optimization WebGPU has received relative to WebGL (at the browser level). There are also some performance optimizations at the wasm binding level that haven't made it into Bevy yet and might be noticeable for trivial benchmarks, e.g., https://github.com/rustwasm/wasm-bindgen/issues/3468 (this applies much more to WebGPU than WebGL).
> They're 10k triangles and they're not overlapping... There are no textures per se. No passes except the main one, with a 1080p render texture. No microtriangles. And I bet the shader is less than 0.25 ALU.
I don't know your exact test case so I can't say for sure, but if there are writes happening per draw call or something similar, then you might see problems like this. Either way, your graphics driver should be receiving roughly the same commands as it would when you use Vulkan or DX12 natively, or WebGL, so there might be something else going on if the performance is a lot worse than you'd expect.
There is some extra overhead per API call (draw, upload, pipeline switch, etc.) because your browser executes graphics commands in a separate rendering process, so this might have a noticeable performance effect for large draw call counts. Batching would help a lot with that whether you're using WebGL or WebGPU.
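To make the batching point concrete, here is a rough TypeScript sketch, not taken from your benchmark or from Bevy. The `Sprite` shape, `packInstances`, and `apiCallsPerFrame` are hypothetical names I made up for illustration, and the call count is a deliberately crude model (one upload plus one draw per sprite when unbatched) rather than a measurement.

```typescript
// Hypothetical per-sprite state; the field names are illustrative only.
interface Sprite {
  x: number;
  y: number;
  rotation: number;
  scale: number;
}

const FLOATS_PER_INSTANCE = 4;

// Pack every sprite into one contiguous array so the whole frame's
// instance data can be uploaded with a single buffer write.
function packInstances(sprites: readonly Sprite[]): Float32Array {
  const data = new Float32Array(sprites.length * FLOATS_PER_INSTANCE);
  sprites.forEach((s, i) => {
    const o = i * FLOATS_PER_INSTANCE;
    data[o] = s.x;
    data[o + 1] = s.y;
    data[o + 2] = s.rotation;
    data[o + 3] = s.scale;
  });
  return data;
}

// Crude model (an assumption, not a measurement) of how many upload and
// draw calls get forwarded to the browser's separate process per frame.
function apiCallsPerFrame(spriteCount: number, batched: boolean): number {
  if (spriteCount === 0) return 0;
  // Unbatched: one write + one draw per sprite.
  // Batched: one write + one instanced draw for everything.
  return batched ? 2 : spriteCount * 2;
}

// With WebGPU the batched path would look roughly like this
// (instanceBuffer and pass are assumed to be set up elsewhere):
//   device.queue.writeBuffer(instanceBuffer, 0, packInstances(sprites));
//   pass.draw(3, sprites.length); // 3 vertices per triangle, one instance per sprite
```

Under that model, 10k unbatched triangles cost about 20k forwarded calls per frame versus 2 when batched, which is the kind of gap that would show up as CPU-side overhead rather than GPU load.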