Hacker News new | ask | show | jobs
by aidenn0 1077 days ago
Thank you for the ggml library, by the way. It let me play around with whisper in a sane manner. To run the CUDA torch versions, I needed to shut down X to free enough GPU memory for the medium model, and the small model might require me to quit firefox. With ggml, I can use cublas and run even the large model with a huge speedup compared to CPU only torch.