| I do low level systems programming, so pretty different from JS-land, but I feel the techniques you should apply when doing optimization generally apply at any level/language. 0) algorithmic improvement. Obvious shit like do a quick sort instead of bubble sort (assuming N > 64, or whatever), not doing unnecessary work in a hot loop, etc 1) reduce memory footprint. The slowest part of your program is almost always just waiting for memory, unless you're doing something that's heavily CPU bound. Web applications are probably always memory bound. Reducing the amount of memory the function you're optimizing operates on reduces DCache misses, which are expensive. 2) Do batch operations. Once I've got something to a point where it's not completely braindead (which, honestly, is where I stop most of the time), I look to start batching things. Usually look to do 8 or 16 at a time in the hopes the compiler/runtime can make some use of SIMD. Use STATIC LOOPS ie (for 0..8) so the compiler can unroll the loop. That's extremely important. 3) probably unavailable (unless you want to/can drop into WASM), but the next step is usually SIMD. This is a rabbit hole, but if you want/need another ~8x perf improvement, this is how to get it 4) once all that's done, it's probably close to optimal in terms of cycles per element (unless I did something boneheaded, which is common). Last step is to multithread it if it needs even more juice. This can range from trivial to completely impossible depending on the algorithm. In JS land, you need to make sure you operate on SharedAreayBufferrs when doing multithreading for performance, because web workers copy the input/output values by default. Anywhoo.. maybe that helps. When I try to optimize something lightly, it's not uncommon for me to get 10x improvement fairly easily. When I optimize something to within an inch of it's life, I can sometimes get three or even four orders of magnitude faster. |
I also should mention that I hate flamegraphs. They only give you a bare minimum amount of information for doing performance work. I'm not sure of a good JS profiler, but what you want to be able to do is mark up the sections of code you want profiled, instead of the profiler taking random samples and squashing them all together. Look at the tracy profiler for an example