Hacker News
by
int_19h
395 days ago
Have you tried quantizing them down to 4 bits to save on RAM?
1 comment
justsid
394 days ago
I have found that even 2-bit quantization works, but you have to make sure you only discard the LABs (that’s what we call the Left Aligned Bits internally). I have no idea why it works so well, but it has cut our costs significantly.
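For readers unfamiliar with the idea being discussed, here is a rough sketch of uniform low-bit quantization with bit packing, which is where the RAM saving comes from. It is not either commenter's code: the function names (`quantize`, `dequantize`, `pack`, `unpack`) and the min/max scaling scheme are assumptions chosen for illustration, and it does not try to model the "LABs" trick described above.

```python
def quantize(values, bits):
    """Map floats to integer codes in [0, 2**bits - 1] using min/max scaling."""
    levels = (1 << bits) - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels if hi > lo else 1.0
    # Clamp to guard against floating-point overshoot at the top of the range.
    codes = [min(levels, max(0, round((v - lo) / scale))) for v in values]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Approximate the original floats from their integer codes."""
    return [lo + c * scale for c in codes]


def pack(codes, bits):
    """Pack codes into bytes, 8 // bits codes per byte (bits must divide 8)."""
    per_byte = 8 // bits
    out = bytearray()
    for i in range(0, len(codes), per_byte):
        byte = 0
        for j, c in enumerate(codes[i:i + per_byte]):
            byte |= c << (j * bits)
        out.append(byte)
    return bytes(out)


def unpack(data, bits, count):
    """Inverse of pack: recover the first `count` codes from packed bytes."""
    per_byte = 8 // bits
    mask = (1 << bits) - 1
    codes = [(b >> (j * bits)) & mask for b in data for j in range(per_byte)]
    return codes[:count]


weights = [-1.0, -0.5, 0.0, 0.5, 1.0, 0.25, -0.75, 0.9]
codes4, lo4, scale4 = quantize(weights, 4)
packed4 = pack(codes4, 4)   # two 4-bit codes per byte
codes2, lo2, scale2 = quantize(weights, 2)
packed2 = pack(codes2, 2)   # four 2-bit codes per byte
```

With this scheme each restored value differs from the original by at most half a quantization step, so going from 4 bits to 2 bits halves the storage again but makes the steps five times coarser (15 levels down to 3 across the same range).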