by ShamelessC
1260 days ago
Heh, you might want to use an equivalent gaming GPU for the price comparison. Surely a thousand dollars spent on an RTX 40-series card (Ada Lovelace) would outperform a P5000? I agree though; IIRC, Tortoise TTS involved a lot of similar work, done by a single person on their own multi-GPU setup. Really impressive effort. Did they get a citation? They deserve one. Edit: reading the other comments, there seems to be a misconception that the model takes 3 seconds to run. That isn't the case: it requires "just" 3 seconds of example audio to successfully clone a voice (for some definition of success).
An RTX 5000 maybe, but I'm not sure how much of a value improvement there is.