| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by vijgaurav 59 days ago
	The 60-minute single-pass transcription is the part that actually matters. Most ASR models chunk audio and you lose speaker continuity across boundaries. If the diarization actually holds up on hour-long recordings without drifting, thats a real solve for podcast and meeting transcription workflows.