by throwaway48540
669 days ago
It did the same thing ChatGPT does when it picks up your writing style and exact words/sentences after a few messages. Literally: the audio is encoded as tokens and fed to the LLM, so from the model's point of view there is no distinction between text and audio.
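
To make the "no distinction" point concrete, here is a toy sketch of a shared token stream. Everything in it is an assumption for illustration: the byte-level text tokenizer, the `audio_to_tokens` helper, and the offset scheme are hypothetical and not how any particular model is known to do it.

```python
# Toy illustration only: vocabulary layout and helper names are hypothetical.
TEXT_VOCAB_SIZE = 256  # assumption: one text token per UTF-8 byte


def text_to_tokens(text: str) -> list[int]:
    """Map text to token ids (here, simply its UTF-8 bytes)."""
    return list(text.encode("utf-8"))


def audio_to_tokens(codes: list[int]) -> list[int]:
    """Map discrete audio codec indices to ids placed after the text ids."""
    return [TEXT_VOCAB_SIZE + c for c in codes]


def build_context(turns: list[tuple[str, object]]) -> list[int]:
    """Flatten mixed text/audio turns into one sequence of integers."""
    context: list[int] = []
    for kind, payload in turns:
        if kind == "text":
            context.extend(text_to_tokens(payload))
        elif kind == "audio":
            context.extend(audio_to_tokens(payload))
        else:
            raise ValueError(f"unknown turn kind: {kind}")
    return context


# The model would only ever see this flat list of ids.
context = build_context([("text", "hi"), ("audio", [3, 7])])
print(context)
```

Once everything is in one sequence like this, continuing the user's voice is the same kind of next-token prediction as continuing the user's phrasing.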