I think when people measure speedups as a percentage they deduct the first 100%. E.g. if I used to be able to process 10 items per second and can now process 20 items per second, I run 200% as fast, but it's a 100% speedup.
To avoid the various ambiguities here, I learned to express speedups as a ratio: the optimized throughput divided by the baseline throughput. Or equivalently: the baseline time divided by the optimized time.
4400 MB/s vs 800 MB/s = 5.5x speedup or 5.5 times as fast.
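For what it's worth, here is a rough Python sketch of that arithmetic. The function names are made up for illustration, and the "percent faster" helper just applies the "deduct the first 100%" convention described above, which is my reading of how people usually quote percentages.

```python
def speedup_ratio(baseline_throughput, optimized_throughput):
    """Speedup as optimized throughput divided by baseline throughput."""
    return optimized_throughput / baseline_throughput


def speedup_from_times(baseline_time, optimized_time):
    """Same ratio, computed from elapsed times for the same amount of work."""
    return baseline_time / optimized_time


def percent_faster(ratio):
    """Percentage speedup with the first 100% deducted (2.0x -> 100%)."""
    return (ratio - 1.0) * 100.0


# The numbers from the example: 800 MB/s baseline, 4400 MB/s optimized.
ratio = speedup_ratio(800, 4400)
print(f"{ratio:.1f}x speedup, {percent_faster(ratio):.0f}% faster")
```

With the numbers above this gives 5.5x, which is the same as "450% faster" but "550% as fast". Quoting the ratio sidesteps having to say which of those two you mean.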