by dheera 1667 days ago
Why are VMs so bad at virtualizing?

An ideal VM should be indistinguishable from a real machine.

For example a virtualized system running Android should generate fake IMU data, not sit at 0 linear acceleration all the time. And have a real-looking fake IMEI, not a string of 0s.
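To illustrate how little effort it takes to spot the two tells mentioned above, here is a minimal sketch. The function names and the idea of checking sample variance are assumptions made for illustration, not how any particular app detects emulators. The only hard facts used are that an IMEI has 15 digits and ends in a Luhn check digit.

```python
from statistics import pvariance


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def looks_virtual(imei: str, accel_samples: list[float]) -> bool:
    """Hypothetical heuristic: flag an implausible IMEI or a perfectly still IMU."""
    imei_suspicious = (
        len(imei) != 15
        or not imei.isdigit()
        or set(imei) == {"0"}      # a string of 0s passes Luhn, so check it explicitly
        or not luhn_valid(imei)
    )
    # Real sensors are noisy; identical readings across samples are a giveaway.
    imu_suspicious = len(accel_samples) > 1 and pvariance(accel_samples) == 0.0
    return imei_suspicious or imu_suspicious
```

A VM that wants to pass even this naive check has to generate an IMEI with a valid check digit and add some noise to its sensor readings. Real detection code presumably looks at far more than this.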

6 comments

Hi, virt engineer here. Partly because it's a very hard problem (in fact, theoretically impossible if you include timing attacks), but mainly because you don't need to emulate the hardware very accurately in order to get common operating systems to run. Getting them to run is all that we're paid to do, and that's a difficult enough job already.

One strange aspect of this is that only a narrow range of current OSes run under virtualization. Qemu is great for running, say, current versions of Linux or Windows, but absolutely terrible if you try to run Linux 1.0 or Windows 95 or Solaris/x86 or any uncommon OS. (I tried a few of these several years ago out of curiosity, and none of them would even boot.) The reason is that we don't emulate enough of the corner cases in CPUs and devices to run those operating systems. E.g. the SATA device only emulates the commands issued by drivers of modern operating systems, not every single command and dark corner of the real hardware.

To be fair, there are emulators that try much harder to be cycle accurate, especially the ones designed to run old games. The MiSTer is the current king here, but that uses an expensive FPGA and can just about emulate a 486 PC.

So step 1 is to emulate the world within which the emulated machine exists?

That's bullshit because Qemu is an emulator too, so it will run Solaris and W95 perfectly.

I am not a virt engineer but I could run W95, OS2 and heck, even Mac OS 9 under Qemu, recently.

Seriously, if you are a virt engineer, drop your title down :).

Qemu has an ISA-only PC machine type (isapc), and you need to disable KVM just to be sure. Set the CPU to Pentium and everything will be fine.

You might want to experiment yourself before making bold assertions, because you are wrong. I've just tried these (with qemu-system-x86-6.0.0-7.fc35.x86_64):

Microsoft_Windows_NT_Server_Version_4.0_227-075-385_CD-KEY_419-1343253_1996.iso (1996, own copy)

Installer starts, locks up with screen corruption about 5 seconds in.

https://archive.org/details/windows-95_fixcpu_iso_windows_is... (1994-ish)

Cannot read the emulated CD-ROM.

https://archive.org/details/redhat-9.0_release (2003)

Installer boots, but fails at partitioning stage, the first time it accesses the disk.

https://archive.org/details/IBMOS2Warp4Collection (1996)

Cannot read the emulated CD-ROM.

Plan 9, 4th ed. (2003, own copy)

Gets quite far, up to the login, although with a lot of errors, but later hangs hard. (Out of all of them this looks closest to being possible to make work.)

I can also tell you that we're moving away from emulating i440fx entirely (to q35), and nothing prior to 2005 will work once that change has been made. In addition, changes to how virtio works mean that guests before about 2010 that use virtio will have problems unless you take special steps.

Ah, sure?

1) Don't use Qemu from the kvm binary.

2) Don't use VirtIO

3) Don't set the CPU higher than a Pentium for w95/w98/NT4; Pentium2 may be fine for w98SE.

   >qemu-system-i386 -M help
   isapc                ISA-only PC

   >qemu-system-i386 --version
   QEMU emulator version 6.1.0
   Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers

This should work for NT4:

   qemu-system-i386 -cdrom $CDROM -m 32 -vga cirrus -net nic,model=pcnet -net user -cpu pentium -hda $DISK

Also, if Qemu enables KVM by default, set the machine acceleration method to TCG.
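For reference, the advice above can be collected into one invocation. This is only a sketch: the helper name and the image file names are made up, and `-accel tcg` is my reading of "set the machine acceleration method to TCG" rather than something taken from the command above. Every other flag is copied from that command.

```python
import shlex


def nt4_qemu_argv(cdrom: str, disk: str, memory_mb: int = 32) -> list[str]:
    """Build a qemu-system-i386 command line for an old guest (hypothetical helper)."""
    return [
        "qemu-system-i386",
        "-accel", "tcg",            # assumption: force software emulation instead of KVM
        "-cpu", "pentium",          # nothing newer than a Pentium for w95/w98/NT4
        "-m", str(memory_mb),
        "-vga", "cirrus",
        "-net", "nic,model=pcnet",  # emulated PCnet NIC, no virtio
        "-net", "user",
        "-cdrom", cdrom,
        "-hda", disk,
    ]


if __name__ == "__main__":
    # Print a copy-pasteable command; the file names are placeholders.
    print(shlex.join(nt4_qemu_argv("nt4.iso", "nt4.img")))
```

Building the argument list rather than a single string keeps paths with spaces intact if it is handed to `subprocess.run`.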

Bye, "virt engineer".

What parameters are you using for Qemu?

Could you provide a link to the discussion about i440fx being removed?

Nah, he is a "virt engineer", ofc he/she doesn't have a clue on Qemu without KVM ;).

JK. Old stuff is difficult to emulate if you only know Qemu from KVM and haven't used Qemu since the Bellard days, or Bochs.

Meanwhile, I emulated w95, w98, Linux 2.2/2.4 based distros (my first Linux), OS/2 and so on just fine. Even BeOS.

I installed NT 3.51 in Qemu not long ago. I can look up the Qemu settings I used if it's of interest.

How does software-based x86 emulation (i.e. OG Connectix Virtual PC) compare to current hardware-assisted virtualization? Were older methods more cycle accurate than what's in use now?

You reminded me of my father showing up at home one time (around 2005, I was 7-8) proudly showing a random CD. Then after a few hours he called me over to show off a virtual Windows 98 PC running in a window on our Windows XP computer. I was fascinated, total awe for a few minutes. Virtual PC became the basis for my experimentation with Windows Server 2003 and newer + Windows clients (even multiple networked PCs ran nicely!), later Linux servers inside VirtualBox, and led to my career in software engineering.

Anyways, to answer your question, Virtual PC and VirtualBox can fully run old as well as new software, and the performance hit is not that bad (I ran multiple virtualized Windows Servers when a PC had 1GB of RAM). However, more modern virtualization methods can offer bare-metal-like performance, which Virtual PC/VirtualBox will never be able to match.

Thank you for your answer but also thank you for making me feel like I’m an old man.

I was in my last year of high school around the time you mentioned when I was experimenting with running Windows NT with a copy of Virtual PC.

Real hardware is finicky and complex. It would be very slow to virtualize every hardware device in a system to a level indistinguishable to software. If you do shoot for complete accuracy (e.g. projects like 86Box), you take at least a ~100x performance hit, and also lose out on useful features like dragging files into/out of the VM.

For anyone interested in this, read through the Dolphin emulator reports [0].

Specifically, look for examples of bugs they've fixed, and why they were triggered.

At this point, they're essentially all of the "X software depended on a quirk of Y feature, to do (whatever), because the developers chose to do it that way." For that one specific piece of software, and nowhere else.

And that's for a game console with highly standardized hardware and libraries. The general purpose computer has a bit larger mutation surface. :-)

Or, to crib from another sibling poster,

"You have a million places to make sure your virtualization looks like the actual artifact. Of those, 100 are used by everything, 1,000 are used by many things, and 10,000 are used by a few things. The remainder may be used by some piece of software out there, somewhere."

"You have a year to build a working product. Are you going to implement and equally test all million things?"

[0] https://dolphin-emu.org/blog/
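
A back-of-the-envelope version of the "million places" quote, as a sketch. The tier sizes are the ones from the quote; the function and variable names are made up for illustration.

```python
TOTAL_BEHAVIOURS = 1_000_000

# Tier sizes taken from the quote, ordered from most to least commonly used.
TIERS = {
    "used by everything": 100,
    "used by many things": 1_000,
    "used by a few things": 10_000,
}


def cumulative_coverage(tiers: dict[str, int], total: int) -> list[tuple[str, int, float]]:
    """Return (tier, behaviours implemented so far, fraction of total) per tier."""
    rows = []
    implemented = 0
    for name, count in tiers.items():
        implemented += count
        rows.append((name, implemented, implemented / total))
    return rows


if __name__ == "__main__":
    for name, implemented, fraction in cumulative_coverage(TIERS, TOTAL_BEHAVIOURS):
        print(f"{name:22} {implemented:>7} implemented  {fraction:.2%} of the surface")
```

Implementing all three tiers covers 11,100 of the million behaviours, a little over 1% of the surface. The other 988,900 are the long tail that some piece of software out there may or may not touch.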

> An ideal VM should be indistinguishable from a real machine.

Ideal for what purpose?

virtio is a good example of where that breaks down. For a lot of use cases, directly exposing an explicitly virtual device rather than emulating real hardware can be much more efficient and avoid bugs.

For example, it may help a virtualised system avoid some layers of caching or optimisation if they are redundant because they are nested inside a system already doing that.

Making your VM indistinguishable from real hardware is nice for some use cases, absolutely, but in many it isn't what you want.

> Ideal for what purpose?

To shove it at companies like Tencent who will ban you for trying to run WeChat in a virtual machine, and restore freedom to the user to run software how they want. WeChat also randomly scans for Wi-Fi networks, I'm guessing they sniff VMs with tricks like that.

It should also be a violation of disability law to force users to use a hardware mobile phone to run a particular piece of software. VMs open the doors to custom accessibility solutions.

They shouldn't even have the right to know what it's running on, they should just hand me bytecode of a suggested (but not required) client, and open a port on their server for service.

Also in general to shove it at any company with potential spyware. I always run unknown closed-source software in a VM and I should have the basic right to do that. But sometimes those companies try to detect VMs. If the VM engine is good enough they shouldn't be able to.

Sure, my point wasn't that there is no use case, just that there are use cases where it isn't necessary and, more than that, is counterproductive.

The goal is usually cooperative virtualisation, not adversarial virtualisation. Most people don't need to hide that the environment is a VM, because the OS and applications by and large don't care about that.

I talked to a security researcher about it a few years ago and, as I understood it, it's a cat-and-mouse game. They are trying to mimic real phones but the malware authors always find a new way to tell whether it's fake.

I'm not aware of any steps security researchers take to obscure the fact they are running in a VM from malware.

Thank you!

VM detection and escape (breaking through the VM to access the host machine) is an active area of research and a very hard nut to crack. It's trench warfare!