Hacker News
by ggerganov 962 days ago
Heh, funny to see this pop up here :)

The performance on Apple Silicon should be much better today than what is shown in the video, as whisper.cpp now runs fully on the GPU and there have been significant improvements in llama.cpp generation speed over the last few months.

5 comments

13 minutes between this and the commit of a new demo video, not bad :D

And impressive performance indeed!

Ah, forget the other message, I watched the videos in the wrong order! And I can’t delete or edit using the Hack app!
Is it just me, or is the GPU version actually slower to respond?
You are kinda famous now, man. Odds are, people follow your GitHub religiously.
Is ggerganov to LLMs what Fabrice Bellard is to QuickJS/QEMU/FFmpeg?
That's a big burden to place on anyone.
I have sent a PR to move that new demo to the top. I think the new demo is significantly better.
Is Apple Silicon the most cost-effective way to run this, or can it be done cheaper on a beefed-up homelab Linux server?
Will this work with the latest distilled Llama?