by paulmd 3088 days ago
Denverton is much more complex than a "simple" Atom (a C3958 reaches up to about half the single-thread performance of an i5-7500, and about twice its total multi-thread performance). Avoton is really no slouch either. It's really not surprising that the incidence of bugs is increasing on those uarchs as the complexity grows.

The Skylake/Kaby hyperthread bug has been fixed in microcode and is no longer applicable. It's perfectly safe to run HT on these processors now.

The AMD Ryzen segfault remains unmitigated at this point in time. Phoronix rushed to declare the bug fixed because they got a binned RMA replacement but there are plenty of reports of it occurring in current-production processors to at least a moderate degree, roughly proportionate with ASIC/litho quality. It's unclear what the scope is w/r/t Epyc since Epyc is on a different stepping but also hasn't really ramped yet either. The early Epyc processors were essentially engineering samples (on the order of hundreds to single-digit thousands of samples) with no real (public) visibility into any binning that might be taking place.

The Ryzen high-address bug is no big deal; that's the kind of thing that gets patched all the time (like the Skylake HT bug). That's one thing Dan is glossing over here - there are tons of these bugs all the time, and as long as there is an effective mitigation available it's no big deal.

The PTI patch can be viewed as making syscalls take somewhat longer (about double iirc). Gamers and compute-oriented workloads won't be hurt hardly at all. The average mixed-workload case sees 5% performance loss, not ideal but it's not critical either. Losing 30% is real bad though, and that's what you will get on IO-heavy workloads that context-switch into the kernel a lot.
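To make the workload dependence concrete, here is a back-of-the-envelope sketch. It assumes a simple model that is mine, not anything measured: some fraction of runtime goes to kernel entry/exit, and PTI multiplies only that part by a constant factor (2.0 is the "about double" guess above). The function names are hypothetical.

```python
def pti_relative_runtime(kernel_entry_fraction: float,
                         cost_multiplier: float = 2.0) -> float:
    """Runtime with PTI relative to runtime without it (1.0 = unchanged).

    kernel_entry_fraction: share of pre-PTI runtime spent entering and
    leaving the kernel (an assumed input, not something PTI reports).
    cost_multiplier: assumed factor by which PTI inflates that share.
    """
    if not 0.0 <= kernel_entry_fraction <= 1.0:
        raise ValueError("kernel_entry_fraction must be between 0 and 1")
    unaffected = 1.0 - kernel_entry_fraction
    return unaffected + kernel_entry_fraction * cost_multiplier


def pti_throughput_loss(kernel_entry_fraction: float,
                        cost_multiplier: float = 2.0) -> float:
    """Fraction of throughput lost under the same model."""
    return 1.0 - 1.0 / pti_relative_runtime(kernel_entry_fraction,
                                            cost_multiplier)


if __name__ == "__main__":
    for fraction in (0.01, 0.05, 0.20, 0.43):
        loss = pti_throughput_loss(fraction)
        print(f"{fraction:.0%} in kernel entry/exit -> {loss:.1%} loss")
```

Under this model, with the multiplier at 2, spending 5% of runtime on kernel entry/exit costs just under 5% of throughput, and you need roughly 43% before the loss reaches 30%. Those are outputs of the model, not benchmarks.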

The only real mitigation right now appears to be giving up hyperconvergence and hardening the DB/NAS servers that push a lot of IO, so that you know no hostile code will be running on them. That will allow you to safely disable PTI and sidestep the performance hit.
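If you do go down that road, it's worth verifying what a given box is actually running. A minimal sketch, assuming a Linux kernel new enough to expose the `/sys/devices/system/cpu/vulnerabilities/meltdown` file (older kernels won't have it, in which case this returns `None`). The helper names are hypothetical.

```python
from pathlib import Path
from typing import Optional

# Present only on kernels that expose the vulnerabilities directory.
MELTDOWN_STATUS = Path("/sys/devices/system/cpu/vulnerabilities/meltdown")


def pti_enabled(status_text: str) -> bool:
    """True if the kernel reports PTI as the active mitigation."""
    return status_text.strip().startswith("Mitigation: PTI")


def read_meltdown_status(path: Path = MELTDOWN_STATUS) -> Optional[str]:
    """Return the kernel's status line, or None if it is unavailable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    status = read_meltdown_status()
    if status is None:
        print("kernel does not expose a Meltdown status file")
    else:
        print(f"{status!r} -> PTI enabled: {pti_enabled(status)}")
```

Disabling PTI itself is done on the kernel command line (`nopti` or `pti=off`), so this only reports the result after a reboot.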

Of course, Epyc was not that good at running databases in the first place, so you still might be better off sucking it up and running Intel even with the PTI patch. It will probably depend on your actual workload and the relative amount of IO vs processing.

2 comments

> The Skylake/Kaby hyperthread bug has been fixed in microcode and is no longer applicable. It's perfectly safe to run HT on these processors now.

Only if you can actually get the fix. My main home PC has this bug and the motherboard manufacturer (ASUS) has yet to ship a BIOS update with the fix.

Your OS can deliver a microcode fixup that's installed at OS startup. Windows should do it automatically, and on Linux you just need to install the intel-microcode package.
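On Linux you can confirm the update actually took by looking at the `microcode` field in `/proc/cpuinfo`. A small sketch, assuming an x86 kernel that reports that field; the function name is hypothetical, and which revision contains the fix for your CPU is something you'd have to look up in Intel's release notes.

```python
from typing import Set


def microcode_revisions(cpuinfo_text: str) -> Set[int]:
    """Collect the distinct microcode revisions listed in /proc/cpuinfo text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            # The kernel prints the revision as hex, e.g. "0xba".
            revisions.add(int(value.strip(), 16))
    return revisions


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            found = microcode_revisions(handle.read())
    except OSError:
        found = set()
    print(sorted(hex(revision) for revision in found) or "no microcode field")
```

More than one value in the result would mean different cores are running different revisions, which is worth investigating on its own.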

> Gamers and compute-oriented workloads won't be hurt hardly at all.

Actually, KPTI affects not only syscalls but also interrupts. It makes interrupts slower, which affects every workload.

Gamers like to use PS/2 peripherals because they're interrupt based and thus more responsive than USB peripherals.

Does this mean they could take a hit due to this bug?

Interrupts from PS/2 are pretty insignificant in comparison to all the other interrupt traffic on any PC.

Edit: also, "interrupt-based" in this case has nothing to do with interrupts seen by the CPU. The difference is that with PS/2 the device can send data to the controller (which then generates an interrupt) at any time, while with USB the controller periodically polls devices for data (and possibly generates an interrupt if there is some). In the early days of USB and UHCI host controllers this polling was done in software, but since USB 2.0 it is done in hardware and generates real CPU interrupts when a USB device requests an interrupt (although with somewhat unpredictable but bounded latency).
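If you want to check the "insignificant" claim on your own machine, here is a rough sketch that parses `/proc/interrupts` and works out what share of all counted interrupts comes from the PS/2 controller. It assumes Linux, and that the controller shows up under the `i8042` driver name (the usual case for PS/2 keyboard and mouse lines); the function names are hypothetical.

```python
from typing import Dict, Tuple


def interrupt_totals(text: str) -> Dict[str, Tuple[int, str]]:
    """Map each /proc/interrupts label to (count summed over CPUs, description)."""
    lines = text.strip().splitlines()
    cpu_count = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = []
        for field in fields[:cpu_count]:
            if not field.isdigit():
                break  # some rows carry fewer counters than there are CPUs
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        totals[label.strip()] = (sum(counts), description)
    return totals


def interrupt_share(totals: Dict[str, Tuple[int, str]], needle: str) -> float:
    """Fraction of all counted interrupts whose description contains needle."""
    grand_total = sum(count for count, _ in totals.values())
    matched = sum(count for count, desc in totals.values() if needle in desc)
    return matched / grand_total if grand_total else 0.0


if __name__ == "__main__":
    try:
        with open("/proc/interrupts") as handle:
            parsed = interrupt_totals(handle.read())
        print(f"i8042 share: {interrupt_share(parsed, 'i8042'):.4%}")
    except OSError:
        print("/proc/interrupts not available")
```

The counters are totals since boot, so run it after a normal session of typing and mousing rather than right after startup.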

Thanks, that's really useful information. I wasn't aware that USB 2.0 switched to hardware-based polling.