by flohofwoe 880 days ago
If the emulator is cycle-correct for the entire system (including the video system, which is usually the most expensive part to emulate, not the CPU), then the speedup on modern computers will be much smaller than the ratio of the clock frequencies suggests.

Consider that each chip of a home computer system (CPU, 1..2 IO/timer chips, audio, video, ...) needs to do 'emulation work' for each 1 or 2 MHz clock cycle which can add up to quite a number of host computer instructions (dozens to hundreds).

If each chip emulator takes just around 10..20 host system clock cycles to emulate one emulated clock cycle, then with five or more chips you are already looking at around 100 host system clock cycles per emulated clock cycle for the entire home computer (in reality it's probably worse).
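As a rough sketch of why the cost multiplies: in a per-cycle design every chip gets ticked once for every emulated clock cycle. The chip count, type names and function names below are hypothetical and only illustrate the loop structure, not any particular emulator's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical chip count: CPU, two IO/timer chips, audio, video. */
#define NUM_CHIPS 5

typedef struct {
    uint64_t ticks; /* emulated cycles this chip has processed so far */
} chip_t;

typedef struct {
    chip_t chips[NUM_CHIPS];
} system_t;

/* Advance one chip by one emulated clock cycle. A real chip emulator
   would do its actual work here (an assumed 10..20 host cycles or more). */
static void chip_tick(chip_t* c) {
    c->ticks++;
}

/* Advance the whole system by one emulated clock cycle:
   every chip is ticked exactly once. */
static void system_tick(system_t* sys) {
    for (int i = 0; i < NUM_CHIPS; i++) {
        chip_tick(&sys->chips[i]);
    }
}

/* Run for a number of emulated cycles and return the total number
   of chip ticks that had to be executed on the host. */
uint64_t system_run(system_t* sys, uint64_t cycles) {
    assert(sys != 0);
    uint64_t total = 0;
    for (uint64_t n = 0; n < cycles; n++) {
        system_tick(sys);
        total += NUM_CHIPS;
    }
    return total;
}
```

With these assumptions, one emulated second of a 1 MHz machine already means 5 million chip ticks on the host, before any of the real per-chip work is counted.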

Such 'vertically sliced' emulation code is also full of conditional branches which put a heavy burden on the host CPU branch predictor.

...and that's how a theoretical 1000x speedup suddenly becomes a 10x speedup. It's not the CPU emulation (this is usually cheap) but the rest of the emulated system which can be expensive.
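The back-of-the-envelope arithmetic can be written down as a tiny helper. The function name and the example clock rates (1 GHz host, 1 MHz emulated machine) are assumptions picked only to reproduce the 1000x and 10x figures; this is an estimate, not a measurement.

```c
#include <assert.h>

/* Estimated speedup over the original machine.
   host_hz:      host clock rate in Hz (assumed value)
   emulated_hz:  emulated system clock rate in Hz (assumed value)
   host_cycles_per_emulated_cycle: assumed cost of emulating one
       clock cycle of the whole system, in host clock cycles */
double estimated_speedup(double host_hz, double emulated_hz,
                         double host_cycles_per_emulated_cycle) {
    assert(emulated_hz > 0.0);
    assert(host_cycles_per_emulated_cycle > 0.0);
    /* the raw clock ratio is the theoretical ceiling ... */
    double clock_ratio = host_hz / emulated_hz;
    /* ... and the per-cycle emulation cost divides it down */
    return clock_ratio / host_cycles_per_emulated_cycle;
}
```

For example, estimated_speedup(1e9, 1e6, 1.0) gives the theoretical 1000x, while estimated_speedup(1e9, 1e6, 100.0) gives 10x.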

Different emulators use all sorts of tricks and shortcuts, but usually with tradeoffs like less precise emulation, or less 'compartmentalized' code.

PS: case in point: this is just the top-level per-cycle function in my C64 emulator, which in turn calls per-cycle functions in each of the chip emulators (which may each be just as much code):

https://github.com/floooh/chips/blob/9a7f6d659b5d1bbf72bc8d0...

I'm trying to strike a balance between 'purity' (e.g. an entire emulated system can be 'plugged together' by wiring together chip emulators, just like one would build a real 8-bit computer on a breadboard), while still being fast enough to comfortably run in realtime even in browsers (https://floooh.github.io/tiny8bit/c64.html).
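To make the 'virtual breadboard' idea more concrete, here is a minimal sketch of one way such wiring could look: each chip has a tick function that takes the shared pin state as a bitmask and returns the updated state, and the system tick just passes the pins from chip to chip. The pin layout, the toy_ names and the two-chip system are all hypothetical and are not taken from the linked code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pin layout: bits 0..15 address bus,
   bits 16..23 data bus, bit 24 read/write (1 = read). */
#define PIN_RW (1ULL << 24)
#define GET_ADDR(p) ((uint16_t)((p) & 0xFFFFULL))
#define GET_DATA(p) ((uint8_t)(((p) >> 16) & 0xFFULL))
#define SET_ADDR(p, a) (((p) & ~0xFFFFULL) | (uint64_t)(a))
#define SET_DATA(p, d) (((p) & ~0xFF0000ULL) | ((uint64_t)(d) << 16))

typedef struct { uint16_t pc; uint8_t last_read; } toy_cpu_t;
typedef struct { uint8_t ram[1 << 16]; } toy_mem_t;

/* Toy CPU: latch the data bus value from the previous cycle,
   then request a read from the next address. */
uint64_t toy_cpu_tick(toy_cpu_t* cpu, uint64_t pins) {
    cpu->last_read = GET_DATA(pins);
    pins = SET_ADDR(pins, cpu->pc);
    cpu->pc++;
    return pins | PIN_RW;
}

/* Toy memory: serve a read or perform a write, depending on the RW pin. */
uint64_t toy_mem_tick(toy_mem_t* mem, uint64_t pins) {
    if (pins & PIN_RW) {
        pins = SET_DATA(pins, mem->ram[GET_ADDR(pins)]);
    } else {
        mem->ram[GET_ADDR(pins)] = GET_DATA(pins);
    }
    return pins;
}

/* The 'breadboard': the wiring is simply the order in which
   the pin state flows through the chips each cycle. */
uint64_t toy_system_tick(toy_cpu_t* cpu, toy_mem_t* mem, uint64_t pins) {
    assert(cpu != 0 && mem != 0);
    pins = toy_cpu_tick(cpu, pins);
    pins = toy_mem_tick(mem, pins);
    return pins;
}
```

The appeal of this style is that the chips know nothing about each other, only about pins. The cost is that every chip has to inspect and update the pin state on every cycle, which is exactly the kind of per-cycle overhead that is hard to optimize away without giving up the separation.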

It's definitely possible to implement a faster cycle-correct C64 emulator with a less 'pure' architecture, but it's quite hard to do meaningful optimizations while keeping that same 'virtual breadboard' architecture.

...considering all the code that runs per cycle it's actually amazing how fast modern CPUs are :)

1 comment

that's an excellent point, i hadn't considered that they might be running this program in basic on a cycle-correct emulator of the bbc micro. thank you very much for explaining