Hacker News
by adsharma, 253 days ago
2 bits out of FP8 would be 25%; 2 bits out of FP16 would be 12.5%.
I've seen recent work that claimed 70% of the params are used for memorization.
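
To check the bit percentages above, here is a minimal sketch. The helper name `bit_fraction` is made up for illustration, and it only assumes FP8 and FP16 are 8 and 16 bits wide; it says nothing about the 70% memorization claim.

```python
def bit_fraction(bits_used: int, total_bits: int) -> float:
    """Return the share of a value's bits taken by `bits_used`, as a percentage."""
    return 100.0 * bits_used / total_bits


# Hypothetical illustration: 2 bits taken out of each format's total width.
for name, width in (("FP8", 8), ("FP16", 16)):
    print(f"2 bits out of {name}: {bit_fraction(2, width)}%")
```

The same 2 bits cost half as much, proportionally, once the format is twice as wide.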