|
|
|
|
|
by brrrrrm
590 days ago
|
|
WebGPU cannot even come close unfortunately since they don't have support for hardware specific memory or warp-level primitives (like TMA or tensorcores). it's not like it gets 80% of perf, it gets < 30% of the peak perf for anything related to heavy compute matrix multiplications |
|
I have no experience with WebGPU but if you mean group shared memory, I think the support is available. See the demo: https://compute.toys/view/25