by f_devd 1136 days ago
I believe it's up-cast locally (in the inner loop or the SIMD op) to float8/float16/int8, but I haven't looked at the internals of llama.cpp myself.
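
To make the "local up-cast" idea concrete, here is a rough sketch of what I mean. It is not taken from llama.cpp: the block size, the 4-bit code layout, and the function names (`quantize_block`, `quantize`, `dot_quantized`) are all made up for illustration, and it up-casts to a plain Python float rather than float16/int8. The point is only that the weights stay in their small format in memory and get widened one element at a time inside the dot-product loop.

```python
from typing import List, Sequence, Tuple

# Hypothetical block size, chosen for illustration only.
BLOCK_SIZE = 32

# One block: a float scale plus 4-bit codes stored as ints in 0..15.
Block = Tuple[float, List[int]]


def quantize_block(values: Sequence[float]) -> Block:
    """Quantize one block of floats to 4-bit codes with a shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    # Map each value to a signed step in -8..7, then offset into 0..15.
    codes = [max(0, min(15, round(v / scale) + 8)) for v in values]
    return scale, codes


def quantize(values: Sequence[float]) -> List[Block]:
    """Split a weight vector into blocks and quantize each one."""
    return [
        quantize_block(values[i:i + BLOCK_SIZE])
        for i in range(0, len(values), BLOCK_SIZE)
    ]


def dot_quantized(blocks: Sequence[Block], activations: Sequence[float]) -> float:
    """Dot product that widens each weight only inside the inner loop."""
    total = 0.0
    offset = 0
    for scale, codes in blocks:
        block_sum = 0.0
        for j, code in enumerate(codes):
            # Local up-cast: the 4-bit code becomes a float just for this multiply.
            weight_step = float(code - 8)
            block_sum += weight_step * activations[offset + j]
        # The scale is applied once per block rather than once per element.
        total += scale * block_sum
        offset += len(codes)
    return total
```

Calling `dot_quantized(quantize(weights), activations)` never builds a full-precision copy of `weights`; only one widened value exists at a time. A real SIMD kernel would presumably do the same thing for a whole register of elements per step, but that part is a guess on my side.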