by sitkack
1644 days ago
Memory bandwidth is a complex beast. One should be able to get 50GB/s for short decodes of single- to double-digit KB on modern hardware. The author measured roughly 11GB/s in their memcpy benchmark, but only half that for decode. If memory bandwidth were the wall, decode should be closer to memcpy in performance. I could see fusing base64 decode and JSON parsing into a single function. It would be cool to see a demo of a processor maintaining good IPC on one hyperthread while the other hyperthread on the same core does a base64 decode.
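For anyone who wants to try the memcpy-vs-decode comparison themselves, here is a rough sketch. It is not the author's benchmark: the helper names (`throughput_gbps`, `compare`) and the 1 MiB default buffer are my own assumptions, and Python's stdlib decoder is not SIMD, so the absolute numbers will be far below the figures above. It only illustrates the reasoning: if decode were limited by memory bandwidth, its throughput would sit close to the plain copy.

```python
import base64
import time


def throughput_gbps(fn, nbytes, repeats=5):
    """Best-of-N throughput in GB/s for fn(), which processes nbytes per call."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading from a coarse timer.
    return nbytes / max(best, 1e-12) / 1e9


def compare(size=1 << 20):
    """Measure a plain copy and a base64 decode over buffers of about `size` bytes."""
    raw = bytes(range(256)) * (size // 256)
    encoded = base64.b64encode(raw)
    dst = bytearray(len(raw))

    def copy():
        dst[:] = raw  # plain memory copy as the bandwidth baseline

    def decode():
        base64.b64decode(encoded)

    return {
        "copy_gbps": throughput_gbps(copy, len(raw)),
        "decode_gbps": throughput_gbps(decode, len(encoded)),
    }


if __name__ == "__main__":
    for name, value in compare().items():
        print(f"{name}: {value:.2f}")
```

A large gap between the two numbers suggests the decoder is compute-bound rather than bandwidth-bound, which is the point about the 11GB/s memcpy versus half that for decode.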