by bainsfather
1855 days ago
Could someone explain what is being done here? I see they are using a GAN and doing unsupervised training, but then they appear to compare their model to models trained with supervision. How do they do this? Do they tack a supervised model onto the end of their unsupervised model? I imagine they must do supervised training at some point; otherwise, how can they convert sounds to text?