Hacker News
by f_devd 1136 days ago
I believe it's up-cast locally (in the inner loop or the SIMD op) to float8/float16/int8, but I haven't looked at the internals of llama.cpp myself.
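To make the idea concrete, here is a rough sketch of what "up-cast locally in the inner loop" could look like. This is not llama.cpp's actual code or storage format: the block size, the 4-bit packing, and the names `quantize_block`, `quantize` and `dot_q4` are all made up for illustration, and plain Python ints and floats stand in for the int8/float16 lanes a real SIMD kernel would use. The weights stay packed in memory and are only widened a few values at a time, right where they get multiplied.

```python
BLOCK = 32  # hypothetical block size, chosen only for this sketch


def quantize_block(xs):
    """Pack one block of floats into 4-bit codes plus a single float scale."""
    amax = max(abs(x) for x in xs)
    scale = amax / 7.0 if amax > 0 else 1.0
    # Signed codes in [-8, 7], stored with a +8 offset so they fit in a nibble.
    codes = [max(-8, min(7, round(x / scale))) + 8 for x in xs]
    packed = bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))
    return scale, packed


def quantize(weights):
    """Split a weight row into blocks; length must be a multiple of BLOCK."""
    assert len(weights) % BLOCK == 0
    return [quantize_block(weights[i:i + BLOCK]) for i in range(0, len(weights), BLOCK)]


def dot_q4(blocks, activations):
    """Dot product that widens the packed weights only inside the inner loop."""
    total = 0.0
    for b, (scale, packed) in enumerate(blocks):
        base = b * BLOCK
        acc = 0.0
        for i, byte in enumerate(packed):
            # The local "up-cast": two 4-bit codes become small signed integers.
            lo = (byte & 0x0F) - 8
            hi = (byte >> 4) - 8
            acc += lo * activations[base + 2 * i] + hi * activations[base + 2 * i + 1]
        # The scale is applied once per block rather than once per weight.
        total += scale * acc
    return total
```

The point is that nothing wider than the packed 4-bit form is ever stored for the whole tensor, so memory traffic stays low and the widening cost is paid per block inside the loop.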