by ttyprintk 582 days ago
Later, the BitNet a4.8 quantization work by some of the same team:

https://news.ycombinator.com/item?id=42092724

https://arxiv.org/abs/2411.04965

and the repo for this project: https://github.com/microsoft/BitNet
The demo they showed was full of repeated sentences. The 3B model looks quite dense, TBH. Did they just want to show the speed?
3B models, especially when quantized, almost always behave like this.