Apple throws a lot of transistors at the 4 performance cores in the M1 to get the performance it does. It's not clear that approach would realistically scale to a workstation CPU with 16, 32, or more cores (at least not with current fab capabilities).