by kjok 1046 days ago
What's the reason for the high inference latency? Any ideas on how this could be improved?

TorToiSe chains several large models: a GPT-2 for the text encodings, plus a large VQ-VAE encoder and a large diffusion-model decoder. They run one after another at inference time, so their latencies add up.
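
A rough way to see which stage dominates is to time each one separately. This is only a sketch: `profile_stages` and the three stage functions are hypothetical stand-ins I made up for illustration, not TorToiSe's actual API, and the toy arithmetic inside them is just there so the script runs.

```python
import time
from typing import Any, Callable, Dict, List, Tuple


def profile_stages(
    stages: List[Tuple[str, Callable[[Any], Any]]], x: Any
) -> Tuple[Any, Dict[str, float]]:
    """Run stages in order, feeding each output to the next, and time each one."""
    timings: Dict[str, float] = {}
    for name, fn in stages:
        start = time.perf_counter()
        x = fn(x)
        timings[name] = time.perf_counter() - start
    return x, timings


# Hypothetical stand-ins for the real models (assumptions, not TorToiSe code).
def text_encoder(text: str) -> List[int]:
    return [ord(c) for c in text]


def latent_encoder(tokens: List[int]) -> List[int]:
    return [t % 7 for t in tokens]


def diffusion_decoder(codes: List[int], steps: int = 50) -> List[float]:
    audio = [float(c) for c in codes]
    for _ in range(steps):  # iterative refinement: one full pass per step
        audio = [0.9 * a for a in audio]
    return audio


STAGES = [
    ("text_encoder", text_encoder),
    ("latent_encoder", latent_encoder),
    ("diffusion_decoder", diffusion_decoder),
]

if __name__ == "__main__":
    _, timings = profile_stages(STAGES, "hello world")
    for name, seconds in timings.items():
        print(f"{name}: {seconds * 1000:.3f} ms")
```

With the real models swapped in for the stand-ins, the per-stage numbers would show where optimization effort is best spent.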

Only the big spaghetti inference code (+ weights) has been published, so there's a high barrier to entry for retraining or improving it.

It has been sped up, but it's still not fast enough for this use case: https://github.com/manmay-nakhashi/tortoise-tts-fastest