by smspf 2181 days ago
I agree on the performance loss. Just for kicks, I ran the same commands on a real aarch64 machine (32 cores, 3.0 GHz, ARMv8.? - I can't remember the exact revision and have already logged off, but I can double-check tomorrow). Without further context, here are the numbers:

  someuser@some-aarch64-machine:~$ docker run arm64v8/ubuntu bash -c 'dd if=/dev/urandom bs=4k count=10k | gzip > /dev/null'
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 2.18298 s, 19.2 MB/s
  someuser@some-aarch64-machine:~$ docker run amd64/ubuntu bash -c 'dd if=/dev/urandom bs=4k count=10k | gzip > /dev/null'
  warning: TCG doesn't support requested feature: CPUID.01H:ECX.vmx [bit 5]
  warning: TCG doesn't support requested feature: CPUID.01H:ECX.vmx [bit 5]
  warning: TCG doesn't support requested feature: CPUID.01H:ECX.vmx [bit 5]
  10240+0 records in
  10240+0 records out
  41943040 bytes (42 MB, 40 MiB) copied, 6.72324 s, 6.2 MB/s

Awesome, thanks for testing this out!

A 3x slowdown is not as bad as 6x, but it's still quite a bit. I also saw a slowdown of ~4x when I tried this experiment on a native Linux x86_64 machine running ARM - perhaps the Mac -> Linux virtualization layer slowed it down further.
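
For reference, the ~3x figure falls straight out of the elapsed times dd reported in the two runs above. A quick sketch of the arithmetic (the `slowdown` helper is just a name made up for illustration, not part of any tool):

```shell
# Hypothetical helper: ratio of emulated to native elapsed time.
slowdown() {
  # $1 = emulated seconds, $2 = native seconds
  awk -v emu="$1" -v nat="$2" 'BEGIN { printf "%.2f\n", emu / nat }'
}

# Elapsed times taken from the dd output of the amd64 and arm64v8 runs.
slowdown 6.72324 2.18298   # prints 3.08
```

The throughput numbers (19.2 MB/s vs 6.2 MB/s) give roughly the same ratio, since both runs copy the same 40 MiB.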

5x may have been a bit alarmist, but regardless we should brace ourselves for a big performance hit on x86_64 emulation.

I'm surprised it's only a 3x slowdown. But the single-thread performance of native execution (without emulation) is worse on aarch64, which was expected. IMO, a better benchmark would take into account multi-threaded performance with and without emulation.
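
A rough sketch of what such a multi-threaded variant could look like, assuming the same dd | gzip workload as in the runs above. `parallel_gzip` is a made-up helper name, and the worker and block counts are arbitrary; it would need to be run inside each image (arm64v8/ubuntu and amd64/ubuntu) to compare the two:

```shell
# Hypothetical helper: run N copies of the dd | gzip pipeline concurrently,
# wait for all of them, then print the total number of bytes compressed.
parallel_gzip() {
  # $1 = number of concurrent workers
  # $2 = 4 KiB blocks per worker (defaults to 10240, i.e. 40 MiB as above)
  workers="$1"
  blocks="${2:-10240}"
  i=0
  while [ "$i" -lt "$workers" ]; do
    dd if=/dev/urandom bs=4k count="$blocks" 2>/dev/null | gzip > /dev/null &
    i=$((i + 1))
  done
  wait  # block until every background pipeline has finished
  echo "$((workers * blocks * 4096))"
}

# Example (in bash): time a single worker, then one worker per core.
#   time parallel_gzip 1
#   time parallel_gzip "$(nproc)"
```

Dividing the printed byte count by the wall-clock time gives an aggregate throughput, which can then be compared between the native and emulated runs in the same way as the single-pipeline numbers.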