by 3cats-in-a-coat 932 days ago
What's the TLDR on how... hardware performs differently on two software runtimes?
2 comments

AMD's implementation of the `rep movsb` instruction is surprisingly slow when addresses are page-aligned. Python's allocator happens to add a 16-byte offset that avoids the hardware quirk/bug.
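To make the alignment point concrete, here is a rough Python sketch. It only shows the address arithmetic, it does not reproduce the slowdown. The names `is_page_aligned` and `buffer_address` are made up for illustration, and the 16-byte shift is just the figure quoted above, not something measured here. It assumes an anonymous `mmap` region starts on a page boundary, which is the usual behaviour.

```python
import ctypes
import mmap

PAGE_SIZE = mmap.PAGESIZE  # page size reported by the OS, commonly 4096


def is_page_aligned(addr: int, page_size: int = PAGE_SIZE) -> bool:
    """Return True if addr sits exactly on a page boundary."""
    return addr % page_size == 0


def buffer_address(buf) -> int:
    """Return the address of the first byte of a writable buffer."""
    return ctypes.addressof(ctypes.c_char.from_buffer(buf))


# An anonymous mapping; the OS hands these out on page boundaries.
region = mmap.mmap(-1, 2 * PAGE_SIZE)
base = buffer_address(region)

# Assumption taken from the comment above: the buffer that actually gets
# copied into starts 16 bytes past the page-aligned base.
shifted = base + 16

print("base page-aligned:   ", is_page_aligned(base))
print("shifted page-aligned:", is_page_aligned(shifted))
```

So a copy into `base` would hit the page-aligned case described above, while a copy into `shifted` would not.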
thank you, upvoted!
One of the very first things in the article is a TLDR section that points you to the conclusion.

> In conclusion, the issue isn't software-related. Python outperforms C/Rust due to an AMD CPU bug.

It is software-related. It's just that the CPU performs badly on a particular instruction.
FSRM (Fast Short REP MOV) is a CPU feature embedded in the microcode (in this instance, amd-ucode) that software such as glibc cannot interact with. I refer to it as hardware because I consider microcode a part of the hardware.
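If you want to see whether your own CPU advertises the feature, something like the following should work on Linux. It is a minimal sketch, assuming the kernel lists the flag as `fsrm` on the `flags` line of `/proc/cpuinfo`. The helper names `cpu_flags` and `has_fsrm` are hypothetical. It only tells you the flag is reported, not whether your microcode revision has the slow path.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_fsrm(cpuinfo_text: str) -> bool:
    """Return True if the 'fsrm' flag appears among the CPU flags."""
    return "fsrm" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only; absent on other systems
    if path.exists():
        print("fsrm reported:", has_fsrm(path.read_text()))
    else:
        print("no /proc/cpuinfo on this system")
```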