| HN Mirror



	by okeysmokey 593 days ago
	Is this using Whisper or something else? It looks like the site performs multiple steps on the audio, and it did a decent job of guessing who was speaking.

1 comments

WinH 593 days ago

Yes. We run Whisper Large V3 (not Turbo) for the speech-to-text step. It still seems to be the best open-source model out there for that. The main challenge we are trying to solve is speaker identification, which is a very time-consuming process.

okeysmokey 593 days ago

How are you doing speaker id?

okeysmokey 593 days ago

It (mostly correctly) ID'd the SCOTUS justices on this one. Pretty cool! https://transcriberai.com/Overview/aa908e33-5680-462a-94ff-6...

LunaRoot 592 days ago

Really cool! Thanks for sharing.
