Hacker News
Nvidia STT Parakeet v3 (huggingface.co)
6 points by MysticOracle 314 days ago
1 comment

New in v3

- Automatic punctuation and capitalization

- Accurate word-level and segment-level timestamps

- Long audio transcription, supporting audio up to 24 minutes long with full attention (on A100 80GB) or up to 3 hours with local attention.

- Released under a permissive CC BY 4.0 license

- Now supports 25 languages:

Bulgarian (bg), Croatian (hr), Czech (cs), Danish (da), Dutch (nl), English (en), Estonian (et), Finnish (fi), French (fr), German (de), Greek (el), Hungarian (hu), Italian (it), Latvian (lv), Lithuanian (lt), Maltese (mt), Polish (pl), Portuguese (pt), Romanian (ro), Slovak (sk), Slovenian (sl), Spanish (es), Swedish (sv), Russian (ru), Ukrainian (uk)
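For anyone wiring this into a pipeline, the long-audio limits quoted above suggest a simple routing rule. Here is a minimal sketch, assuming the 24-minute and 3-hour figures are used as hard cutoffs; the function name and the "full" / "local" / "split" labels are made up for illustration and are not part of any library.

```python
# Limits taken from the release notes quoted above.
# The 24-minute figure is stated for an A100 80GB; smaller GPUs may need a lower cutoff.
FULL_ATTENTION_MAX_S = 24 * 60        # 24 minutes
LOCAL_ATTENTION_MAX_S = 3 * 60 * 60   # 3 hours


def pick_attention_mode(duration_s: float) -> str:
    """Return a hypothetical routing label for a clip of the given length.

    "full"  -> fits within the full-attention limit
    "local" -> too long for full attention, fits within the local-attention limit
    "split" -> longer than both limits; the caller has to chunk the audio
    """
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    if duration_s <= FULL_ATTENTION_MAX_S:
        return "full"
    if duration_s <= LOCAL_ATTENTION_MAX_S:
        return "local"
    return "split"


if __name__ == "__main__":
    for seconds in (10 * 60, 45 * 60, 4 * 60 * 60):
        print(seconds, pick_attention_mode(seconds))
```

The labels only decide which path to take. Actually switching the model to local attention happens in the ASR toolkit itself (I believe the model card shows a NeMo call for this, but it is not exercised here).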

If this is as fast as the English-only version of Parakeet, then this is gonna displace Whisper entirely!