No, the classical simulation should give you the whole wavefunction, from which you can get the full probability distribution. One run should be enough.
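To make the "one run is enough" point concrete, here is a toy sketch of a statevector simulation in plain Python. It is only an illustration on 2 qubits, not the method used in any real supercomputer simulation; the helper names `apply_single_qubit_gate` and `apply_cnot` are made up for this example. Once the final amplitudes are known, the whole output distribution falls out directly, with no repeated sampling.

```python
import math

def apply_single_qubit_gate(state, gate, target):
    """Apply a 2x2 gate to qubit `target` of a statevector (list of complex)."""
    new_state = list(state)
    stride = 1 << target
    for i in range(len(state)):
        if i & stride == 0:
            # i has the target bit clear; j is its partner with the bit set.
            j = i | stride
            a0, a1 = state[i], state[j]
            new_state[i] = gate[0][0] * a0 + gate[0][1] * a1
            new_state[j] = gate[1][0] * a0 + gate[1][1] * a1
    return new_state

def apply_cnot(state, control, target):
    """Flip the target bit of every basis state whose control bit is set."""
    new_state = list(state)
    for i in range(len(state)):
        if (i >> control) & 1 and not (i >> target) & 1:
            j = i | (1 << target)
            new_state[i], new_state[j] = state[j], state[i]
    return new_state

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]  # Hadamard gate

n = 2
state = [0j] * (1 << n)
state[0] = 1 + 0j                      # start in |00>
state = apply_single_qubit_gate(state, H, 0)
state = apply_cnot(state, 0, 1)        # Bell state (|00> + |11>) / sqrt(2)

# The full probability distribution, read straight off the amplitudes.
probabilities = [abs(a) ** 2 for a in state]
```

The catch is that `state` has 2^n entries, which is where the memory problem comes from.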
Does it scale linearly? Would it take 5 days on a supercomputer half as powerful? They just need to show that it can be done in a reasonable finite amount of time, not specifically in 2.5 days.
The main bottleneck is memory/storage capacity, not speed. Beyond that (if you have the requisite storage capacity and bandwidth), yes: if you have half the FLOPS and double the time, you get the same result.
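To put rough numbers on the memory point: a full statevector of n qubits holds 2^n complex amplitudes. Below is a back-of-the-envelope sketch, assuming 16 bytes per amplitude (double-precision complex) and an idealized runtime that scales inversely with FLOPS. The function names and the qubit counts are illustrative assumptions, not figures from any particular simulation.

```python
def statevector_bytes(num_qubits, bytes_per_amplitude=16):
    """Memory for a full statevector: 2**num_qubits complex amplitudes.

    Assumes 16 bytes per amplitude (two 64-bit floats); single precision
    would halve this.
    """
    return (1 << num_qubits) * bytes_per_amplitude

def ideal_runtime_days(base_days, base_flops, new_flops):
    """Idealized scaling: runtime is inversely proportional to FLOPS.

    Ignores memory, bandwidth, and communication limits entirely.
    """
    return base_days * base_flops / new_flops

# Each extra qubit doubles the storage requirement.
gib_30_qubits = statevector_bytes(30) / 2**30
gib_31_qubits = statevector_bytes(31) / 2**30

# Half the FLOPS, double the time (in this idealized model).
days_on_half_machine = ideal_runtime_days(2.5, base_flops=2.0, new_flops=1.0)
```

Under these assumptions 30 qubits need 16 GiB and every additional qubit doubles that, so storage runs out long before raw speed becomes the issue.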
Is this the case? I thought parallel speedup is not linear, so it might not actually be 2x slower but maybe 1.5x or so, depending on IPC overhead and all that.
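One way to sanity-check the non-linear intuition is Amdahl's law, which models runtime as a serial part plus a parallel part divided across processors. This is a minimal sketch: the parallel fraction of 0.9 and the processor counts are made-up values for illustration, and real IPC overhead is not modeled at all.

```python
def amdahl_runtime(parallel_fraction, num_processors):
    """Normalized runtime under Amdahl's law (1.0 = single-processor time)."""
    serial_fraction = 1.0 - parallel_fraction
    return serial_fraction + parallel_fraction / num_processors

def slowdown_from_halving(parallel_fraction, num_processors):
    """How much slower the job gets when the processor count is halved."""
    full = amdahl_runtime(parallel_fraction, num_processors)
    half = amdahl_runtime(parallel_fraction, num_processors // 2)
    return half / full

# Hypothetical workload: 90% parallelizable, going from 8 processors to 4.
partial = slowdown_from_halving(0.9, 8)

# Perfectly parallel workload: halving the hardware exactly doubles the time.
perfect = slowdown_from_halving(1.0, 8)
```

With these assumed numbers the slowdown comes out at roughly 1.5x rather than 2x, and it only reaches exactly 2x when the work is perfectly parallel.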