by powersnail
1716 days ago
> What bothered me about the original implementation was the lookup table. Even though I knew they’d be cached, I still thought the memory accesses might have a detrimental affect on performance.

The encoding lookup table is an array of four chars. I'd be surprised if accessing such an array had a detrimental effect on any program.

I also wonder why the graphs say "Encode / Decode", as if you were combining the performance of the encoding function and the decoding function. Have you considered separating them? It would also help reproducibility if you included your compiler, its version, and the flags you used.

You mentioned that you turned off all optimizations, but I wonder why. -O0 would certainly produce a lot of jumps for the switch, but -O2 is going to eliminate those jumps. In fact, with -O2, GCC 11 seems to produce identical code for the switch and the lookup table.

https://godbolt.org/z/jdx645MsY
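
For anyone who wants to try the comparison locally, here is a minimal sketch of the two variants. The four-symbol alphabet and the names `TABLE`, `decode_lut`, and `decode_switch` are my own placeholders, not the article's code, so treat it as an illustration of the shape of the comparison rather than a reproduction of the benchmark.

```c
/* Hypothetical four-symbol alphabet; the article's actual table may differ. */
static const char TABLE[4] = {'A', 'C', 'G', 'T'};

/* Lookup-table variant: a single indexed load from a 4-byte array. */
char decode_lut(unsigned code) {
    return TABLE[code & 3u]; /* mask keeps the index within the table */
}

/* Switch variant: same mapping expressed as explicit cases.
 * Without optimization this is typically a chain of compares and jumps;
 * an optimizing compiler may lower it to a table load instead. */
char decode_switch(unsigned code) {
    switch (code & 3u) {
    case 0:  return 'A';
    case 1:  return 'C';
    case 2:  return 'G';
    default: return 'T';
    }
}
```

Compiling this with `gcc -O0 -S` and again with `gcc -O2 -S` and diffing the assembly for the two functions should show whether your compiler treats them differently; the exact output depends on the compiler version and target.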