by terafo 1202 days ago
It fits. whisper.cpp uses 4-bit quantization, so a 13B model takes a little more than 8 GB and uses around 9 GB of RAM during inference.
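For anyone wondering why 4 bits per weight lands above 8 GB rather than at 6.5 GB, here is a rough back-of-the-envelope sketch. It assumes block-wise quantization where every 32 weights share one 32-bit scale factor; that layout is my assumption for illustration, not something stated above, and `quantized_size_gb` is a hypothetical helper, not part of any library.

```python
def quantized_size_gb(n_params, bits_per_weight=4, block_size=32, scale_bits=32):
    """Estimate weight storage in GB (10^9 bytes) for block-wise quantization.

    Assumption: each block of `block_size` weights carries one scale of
    `scale_bits` bits, so the per-weight overhead is scale_bits / block_size.
    """
    effective_bits = bits_per_weight + scale_bits / block_size
    return n_params * effective_bits / 8 / 1e9


# 13B parameters, counting only the 4-bit values (no scales).
raw = quantized_size_gb(13e9, scale_bits=0)

# Same model with the assumed per-block scale overhead included.
with_scales = quantized_size_gb(13e9)

print(f"{raw:.2f} GB raw, {with_scales:.2f} GB with scales")
```

Under that assumption the effective cost is about 5 bits per weight, which is consistent with "a little more than 8 GB" for the weights. The extra RAM during inference would then come from things like the context cache and scratch buffers, which this sketch does not model.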