Hacker News
by
Tepix
60 days ago
Sounds good. I saw that you use the FP8 version of the model. Do you also quantize the KV cache?
1 comment
sacrelege
60 days ago
No, I don't, since there seems to be a silent degradation bug.