| HN Mirror

	by jwrallie 356 days ago
	From my own experience with whisper.cpp, normalizing the audio and removing silence not only shortens the processing time significantly but also greatly improves the quality of the transcription, since silence can cause hallucinations. You can do that graphically with Audacity too, if you do not want to deal with the command line. You also do not need any special hardware to run whisper.cpp: with the small model, practically any computer should be able to do it if you can wait a bit (less than the audio length). One half-interesting, half-depressing observation I made is that at my workplace, any meeting recording I tried to transcribe this way shrank to almost 2/3 of its original length once the silence was cut. Makes you think about the efficiency (or lack of it) of holding long(ish) meetings.
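To make the preprocessing idea concrete, here is a minimal pure-Python sketch of peak normalization followed by removal of long silent stretches. The comment does not say which tool or settings were used, so everything here is an assumption for illustration: the function names `peak_normalize` and `drop_silence`, the 20 ms frame size, the RMS threshold, and the 500 ms minimum pause are all hypothetical. For real recordings you would more likely use a command-line audio tool or Audacity, as the comment suggests.

```python
import math


def peak_normalize(samples, target_peak=0.9):
    """Scale float samples so the loudest one reaches target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # all-silent input: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


def drop_silence(samples, sample_rate, frame_ms=20, threshold=0.01,
                 min_silence_ms=500):
    """Remove silent runs that last at least min_silence_ms.

    A frame counts as silent when its RMS is below threshold.
    Shorter pauses are kept so speech does not run together.
    All defaults are illustrative guesses, not tuned values.
    """
    frame_len = max(1, sample_rate * frame_ms // 1000)
    min_silent_frames = max(1, min_silence_ms // frame_ms)
    kept, silent_run = [], []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        if rms < threshold:
            silent_run.append(frame)
            continue
        # A loud frame ends the current silent run: keep the run
        # only if it was too short to count as a real pause.
        if len(silent_run) < min_silent_frames:
            for f in silent_run:
                kept.extend(f)
        silent_run = []
        kept.extend(frame)
    # Handle silence at the very end of the recording the same way.
    if len(silent_run) < min_silent_frames:
        for f in silent_run:
            kept.extend(f)
    return kept


# Synthetic example: 1 s of "speech", 1 s of silence, 1 s of "speech"
# at a toy sample rate of 1000 Hz.
tone = [0.5, -0.5] * 500
audio = tone + [0.0] * 1000 + tone
cleaned = drop_silence(peak_normalize(audio), sample_rate=1000)
print(len(cleaned) / len(audio))
```

In this toy input a third of the signal is silence, so the cleaned signal is two thirds of the original length, mirroring the meeting recordings described above. Normalizing first means a single fixed threshold behaves more consistently across quiet and loud recordings.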

3 comments

dogprez 356 days ago

Others pointed out the value of silence, but I just wanted to say it saddens me when humanity is misclassified as inefficiency. The other day Sam Altman made a jest about how much energy is wasted by people saying "thanks" to chatgpt. The corollary is how much human energy is wasted on humans saying thanks to each other. When making a judgement about inefficiency one is making a judgement on what is valuable, a very biased judgement that isn't necessarily aligned with what makes us thrive. =) (<-- a wasteful smiley)

Philip-J-Fry 356 days ago

Well, humans saying thanks to each other isn't wasted energy. It has a real effect on our relationships.

People say thank you to AI because these systems are presented as human-like chatbots, but in reality it has almost no effect on how well they respond to our queries.

Saying thank you to ChatGPT is no less wasteful than saying thank you to Windows for opening the calculator.

I don't think anyone is trying to draw any parallels between that inefficiency and real humans saying thank you?

mewpmewp2 355 days ago

Saying thank you might still make sense in theory with AI, if the AI used it as a clue to learn how useful the response was. Currently there are thumbs up and thumbs down buttons, but it is very possible that a thank-you also has mid-conversation effects within the same context.

kristianbrigman 356 days ago

I’ll remember that you told me thanks. Will chatgpt? (Honestly curious… it’s possible)

rz2k 356 days ago

I get the impression that it sets a tone that encourages creative, more open-ended responses.

I think this is the reverse of confrontation with the LLM. Typically if you get a really dumb response, it is better to hang up the conversation and completely start over than it is to tell the LLM why it is wrong. Once you start arguing, they start getting stupider and respond with even faultier logic as they try to appease you.

I suppose it makes sense if the training involves alternate models of discourse resembling two educated people in a forum with shared intellectual curiosity and a common goal, or two people having a ridiculous internet argument.

Salgat 356 days ago

I say thanks for my own well-being too.

mulmen 356 days ago

Humans are inefficient. The mistake is making a moral judgement about that.

d1sxeyes 356 days ago

1/3 of the meeting is silence? That’s a good thing. It’s allowing people time to think over what they’re hearing, there are pauses to allow people to contribute or participate. What do you think a better percentage of silent time would be?

jwrallie 356 days ago

Good point. Somehow, if I think of a 30-minute meeting, 10 minutes of silence sounds great, but seeing a 1-hour block disappear from a 3-hour recording makes me want to use that “free” hour to do something else.

Well, I think silence is not the real problem with a 3-hour meeting!

literalAardvark 356 days ago

If people could speak continuously for an entire meeting, then that meeting would be better off as an email. Meetings are for bouncing half-formed ideas around and coagulating them into something greater.

There MUST be time to think.

sudhirj 356 days ago

If a human meeting had a lot of silence (assuming it's between words and not before / after), I would consider it a very efficient meeting where just enough information was exchanged, with adequate absorption, processing and response time.
