


	by putlake 144 days ago
	VibeVoice-ASR is a unified speech-to-text model designed to handle 60-minute long-form audio in a single pass, generating structured transcriptions containing Who (Speaker), When (Timestamps), and What (Content), with support for Customized Hotwords.
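	To make the Who / When / What structure concrete, here is a minimal sketch of how such a transcript could be represented and printed. The `Segment` class, its field names, and the `render` helper are hypothetical and chosen only for illustration; this is not VibeVoice-ASR's actual output schema or API.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcript entry. Field names are hypothetical, not the model's schema."""
    speaker: str    # Who
    start_s: float  # When: start time in seconds
    end_s: float    # When: end time in seconds
    text: str       # What


def format_timestamp(seconds: float) -> str:
    """Render seconds as MM:SS (fractions of a second are truncated)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def render(segments: list[Segment]) -> str:
    """Produce one line per segment, ordered by start time."""
    ordered = sorted(segments, key=lambda s: s.start_s)
    return "\n".join(
        f"[{format_timestamp(s.start_s)}-{format_timestamp(s.end_s)}] "
        f"{s.speaker}: {s.text}"
        for s in ordered
    )


if __name__ == "__main__":
    # Made-up example data, not real model output.
    demo = [
        Segment("Speaker 2", 65.0, 70.5, "Hi"),
        Segment("Speaker 1", 0.0, 4.2, "Hello"),
    ]
    print(render(demo))
```

	Keeping speaker, time span, and text together in one record means downstream steps (search, subtitles, per-speaker summaries) can filter or sort on any of the three without re-parsing text.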