by polyomino
383 days ago
We ran into this problem when converting audio-only LLM applications to visual + audio.
The visuals added significant latency because they have to be parsed completely before they can be displayed, whereas audio can be played token by token, with the LLM generating the next token while the current one is playing.
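
To make the difference concrete, here is a minimal sketch that compares time to first output for the two cases. It uses a simulated clock rather than a real model, and the 50 ms per-token interval, the 40-token payload, and all function names are assumptions chosen for illustration, not measurements from our system.

```python
from typing import Iterator, List, Tuple

TOKEN_INTERVAL_MS = 50  # assumed time for the model to produce one token


def generate(tokens: List[str]) -> Iterator[Tuple[int, str]]:
    """Yield (arrival_time_ms, token) as a stand-in for a streaming LLM."""
    for i, tok in enumerate(tokens, start=1):
        yield i * TOKEN_INTERVAL_MS, tok


def first_output_streaming(tokens: List[str]) -> int:
    """Audio-style: the first token can be played as soon as it arrives."""
    arrival_ms, _ = next(generate(tokens))
    return arrival_ms


def first_output_buffered(tokens: List[str]) -> int:
    """Visual-style: nothing is shown until the whole payload has arrived."""
    arrival_ms = 0
    for arrival_ms, _ in generate(tokens):
        pass  # keep consuming; the last arrival time is when parsing can start
    return arrival_ms


tokens = ["tok"] * 40  # hypothetical 40-token response
print(first_output_streaming(tokens))  # 50
print(first_output_buffered(tokens))   # 2000
```

In the streaming case the wait before anything reaches the user stays at one token interval regardless of response length, while in the buffered case it grows linearly with the number of tokens.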