Y
Hacker News
new
|
ask
|
show
|
jobs
by
magnat
657 days ago
Because you can quantize a model e.g. from original 16 bits down to 5 bits per weight to fit your available memory constraints.