by woodson
749 days ago
Replacing the codebook approach with a statistical/DNN model is more likely to improve accuracy than getting rid of MFCCs as the spectral representation (at least in general ASR). (Arguably, using Mel spectra was the least controversial design choice made for Whisper.)