Slight technical inaccuracy at the start: the Z80 also required a minimum of 4 clocks for a memory access, so it wasn't better than the 8088 in that regard.
Nope. An opcode fetch cycle takes 4 clocks, but a normal memory read or write takes only 3. That's why an instruction like LD A,(HL) takes 7 clocks: 4 for the opcode fetch plus 3 for the read. I believe the GBZ80 always takes 4, but it's more of a separate 8080 clone that borrows a bit from the Z80 than an actual Z80.
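
To make the arithmetic concrete, here's a rough sketch that adds up clocks per machine cycle for a few Z80 instructions. The table is hand-entered for illustration and assumes no wait states; the names (`OPCODE_FETCH`, `INSTRUCTIONS`, `t_states`) are just made up for this example.

```python
# Clocks (T-states) per Z80 machine-cycle type, assuming no wait states.
OPCODE_FETCH = 4  # M1 cycle, includes the refresh period
MEM_READ = 3
MEM_WRITE = 3

# Machine-cycle breakdown for a few instructions (hand-entered, illustrative).
INSTRUCTIONS = {
    "NOP":       [OPCODE_FETCH],
    "LD A,(HL)": [OPCODE_FETCH, MEM_READ],
    "LD (HL),A": [OPCODE_FETCH, MEM_WRITE],
    # Opcode fetch, two address bytes, then the data read itself.
    "LD A,(nn)": [OPCODE_FETCH, MEM_READ, MEM_READ, MEM_READ],
}

def t_states(mnemonic):
    """Total clocks for one instruction: the sum of its machine cycles."""
    return sum(INSTRUCTIONS[mnemonic])

if __name__ == "__main__":
    for name in INSTRUCTIONS:
        print(f"{name:10} {t_states(name)} clocks")
```

So the 7 for LD A,(HL) is just 4 + 3, and only the opcode fetch costs the full 4.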