by behnamoh 391 days ago
They responded to my tweet last year and said they didn't quantize the models.
1 comment

It's very hard to find right now, but I'm sure they said they don't quantize the KV cache, though their weights are in fp8.