| HN Mirror



	by codethief 402 days ago
	> Maybe I've just had a bad microphone.

	Yeah, I would definitely double-check your setup. At work we use Whisper to live-transcribe-and-translate all-hands meetings and it works exceptionally well.

2 comments

s3p 402 days ago

+1 this. Whisper works insanely well. I've been using the medium model, as it has yet to mistranscribe anything noticeable, and it's very lightweight. I even converted it to a Core ML model so it runs accelerated on Apple silicon. It doesn't run *that* much faster than before, but it ran really fast to begin with. For anyone tinkering, I've had much success with whisper.cpp.


azinman2 402 days ago

What was the process of converting it like? I assume you then had to write all of the inference code as well?


tough 401 days ago

not the gp but found this https://github.com/ggml-org/whisper.cpp/blob/master/models/c...


Grimblewald 401 days ago

I'd agree with your experience. I simply set my phone (a cheap ~$200 Motorola) in the centre of the room, split the voice file into chunks using voice prints/IDs I get from a voice embedding model I trained, then feed the labelled chunks through Whisper and get a nice transcript of everything said. I combine that with my handwritten notes (as an image, which I get a VLM to transcribe) and the agenda, and I get really nice meeting minutes out as a LaTeX document. Works a charm and has turned an hour or two of work per meeting into maybe 30 minutes (proofing what was written).
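
For anyone wanting to try something similar, here is a minimal sketch of just the "labelled chunks in, transcript out" step. It is not the commenter's actual code: the `Chunk` type, the `build_transcript` function, and the injected `transcribe` callable are all hypothetical names, and the speaker labels are assumed to come from some separate speaker-ID step that is not shown.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Chunk:
    """One speaker-labelled slice of the recording (hypothetical structure)."""
    speaker: str     # label produced by a separate speaker-ID step (assumed)
    start: float     # start time in seconds
    end: float       # end time in seconds
    audio_path: str  # path to this chunk's audio file


def build_transcript(chunks: Iterable[Chunk],
                     transcribe: Callable[[str], str]) -> str:
    """Transcribe chunks in time order and merge consecutive turns by one speaker.

    `transcribe` is any function mapping an audio path to text; it stands in
    for a real speech-to-text call and is an assumption of this sketch.
    """
    lines = []  # each entry is [speaker, text]
    for chunk in sorted(chunks, key=lambda c: c.start):
        text = transcribe(chunk.audio_path).strip()
        if not text:
            continue  # skip chunks that produced no text
        if lines and lines[-1][0] == chunk.speaker:
            lines[-1][1] += " " + text  # same speaker kept talking
        else:
            lines.append([chunk.speaker, text])
    return "\n".join(f"{speaker}: {text}" for speaker, text in lines)
```

With the `openai-whisper` Python package, `transcribe` could plausibly be something like `lambda path: model.transcribe(path)["text"]` after `model = whisper.load_model("medium")`; a whisper.cpp wrapper would slot in the same way. Keeping the transcription call injected makes the merging logic easy to test without loading a model.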
