by eru
366 days ago
I wonder if you can hide the latency, especially for voice? What I have in mind is to start the voice response with a non-thinking model, which can produce a sentence or two in a fraction of a second. That will take the voice model a few seconds to read out. In that time, you use a thinking model to start working on the next part of the response. In a sense, it's very similar to how everyone knows to stall in an interview by starting with 'this is a very good question...' and using that time to think some more.
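
To make the idea concrete, here is a rough sketch of the overlap using Python's asyncio. Everything in it is an assumption for illustration: `fast_opener`, `thinking_continuation`, and `speak` are hypothetical stand-ins for a non-thinking model, a thinking model, and text-to-speech playback, and the sleeps just simulate their latency.

```python
import asyncio


async def fast_opener(question: str) -> str:
    # Hypothetical stand-in for a non-thinking model: returns almost at once.
    await asyncio.sleep(0.01)
    return "That's a very good question."


async def thinking_continuation(question: str, opener: str) -> str:
    # Hypothetical stand-in for a slower thinking model. It receives the
    # opener so the continuation can follow on from what was already said.
    await asyncio.sleep(0.05)
    return f"Here is a considered answer to: {question}"


async def speak(text: str, spoken: list[str], seconds: float) -> None:
    # Hypothetical stand-in for text-to-speech; the sleep simulates playback.
    await asyncio.sleep(seconds)
    spoken.append(text)


async def respond(question: str) -> list[str]:
    spoken: list[str] = []
    opener = await fast_opener(question)
    # Start the thinking model as a background task before playback begins.
    slow = asyncio.create_task(thinking_continuation(question, opener))
    # While the opener is being read out, the thinking model keeps working.
    await speak(opener, spoken, 0.05)
    await speak(await slow, spoken, 0.0)
    return spoken


def run_demo(question: str) -> list[str]:
    return asyncio.run(respond(question))
```

If playback of the opener takes at least as long as the thinking model needs, the listener should hear no gap; if the thinking model is slower, the gap shrinks by the playback time rather than disappearing.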