|
|
|
|
|
by amiga386
522 days ago
|
|
> On m68k, shifting right by 1 and performing a logical AND both take 8 CPU cycles. But the right-shift is 2 bytes smaller There's also BTST #0,xx but it wastefully needs an extra 16 bits say which bit to test (even though the bit can only be from 0-31) > That makes a difference on Amiga, because (other than size) the DMA might be shared with other chips, so you're saving yourself a memory read that could stall the CPU while it's waiting its turn. That's a load-bearing "could". If the 68000 has to read/write chip RAM, it gets the even cycles while the custom chips get odd cycles, so it doesn't even notice (unless you're doing something that steals even cycles from the CPU, e.g. the blitter is active and you set BLTPRI, or you have 5+ bitplanes in lowres or 3+ bitplanes in highres) |
|
That reminds me, it's theoretically fastest to do `and d1,d0` e.g. in a loop if d1 is pre-loaded with the value (4 cycles and 1 read). `btst d1,d0` is 6 cycles and 1 read.
> the blitter is active and you set BLTPRI
I thought BLTPRI enabled meant the blitter takes every even DMA cycle it needs, and when disabled it gives the CPU 1 in every 4 even DMA cycles. But yes, I'm splitting hairs a bit when it comes to DMA performance because I code game/demo stuff targeting stock A500, meaning one of those cases (blitter running or 5+ bitplanes enabled) is very likely to be true.