I think the nature/feel of the quick response is likely due to the fact the completion call is streaming "stream: true" [1]. Then, they are parsing it back to text as it streams the response back.
I'd have been more surprised if there wasn't streaming. All the LLMs I have ever used, stream tokens in real-time. I'm talking about the time spent per token.