Whisper doesn't list its training data (or code?), so it can't be an open-source model, just an open-weights model.
Parakeet does list its training data. At least one of those datasets is not FOSS, but some of them definitely are. I wonder if NVIDIA would create a fully FOSS model by retraining on only the open data.
https://huggingface.co/nvidia/parakeet-rnnt-1.1b#datasets https://catalog.ldc.upenn.edu/LDC2004T19 https://catalog.ldc.upenn.edu/license/ldc-non-members-agreem...