Wouldn’t speculative decoding decrease overall throughput, but optimise (perceived) responsiveness?