Hacker News
by WhackyIdeas 886 days ago
Can it run local only?

Yes, on a consumer 4090 card it's 12x faster than real-time. We'll benchmark some older cards as well for comparison.
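
For anyone wondering what "12x faster than real-time" means in practice, here is a rough sketch of the arithmetic. The function names are made up for illustration, and it assumes the speed-up is simply content duration divided by wall-clock processing time, which may not be exactly how the benchmark was measured.

```python
def real_time_factor(content_seconds: float, processing_seconds: float) -> float:
    """Speed-up over real time: seconds of content handled per second of compute."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return content_seconds / processing_seconds


def processing_time(content_seconds: float, speedup: float) -> float:
    """Wall-clock seconds needed for a clip at a given speed-up factor."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return content_seconds / speedup


# At the quoted 12x, one minute of content takes about five seconds.
print(processing_time(60.0, 12.0))  # 5.0
```

A slower card just means a smaller factor; anything above 1.0 still keeps up with real time.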

I think it should work pretty well with Apple's MLX framework as well, if anyone is willing to convert it. :)

And totally private, as in no internet needed?
Yes, you download the weights once from Hugging Face and you can do whatever you want with them. :) We have no cloud APIs or usage tracking of any kind.
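
A minimal sketch of the "download once, then run offline" flow, assuming the weights live in an ordinary Hugging Face Hub repo and the `huggingface_hub` package is installed. The two helper names are hypothetical, and the repo id is left as a parameter because the thread does not name it; the project's own loader may well handle this differently.

```python
import os


def fetch_weights_once(repo_id: str) -> str:
    """First run, with internet: download the repo into the local Hugging Face cache."""
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)


def load_weights_offline(repo_id: str) -> str:
    """Later runs: return the cached copy's path without touching the network."""
    # The library reads HF_HUB_OFFLINE at import time, so set it before
    # huggingface_hub is first imported in this process.
    os.environ["HF_HUB_OFFLINE"] = "1"
    from huggingface_hub import snapshot_download

    # local_files_only=True makes the call fail instead of going online
    # if the files are not already in the cache.
    return snapshot_download(repo_id=repo_id, local_files_only=True)
```

Both functions return the local directory holding the downloaded files, so after the first fetch the machine can stay disconnected.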
Since it's 12x faster than real time on a 4090, I wonder how fast it would be on a small form factor device (an SBC). I gather this uses CUDA, so I really wonder how it would perform on my NVIDIA Xavier NX (and the more common Nanos out there)!