Hacker News
by
skavi
582 days ago
and the repo for this project:
https://github.com/microsoft/BitNet
sinuhe69
582 days ago
The demo they showed was full of repeated sentences. The 3B model looks quite dense, TBH. Did they just want to show the speed?
newswasboring
582 days ago
3B models, especially when quantized, almost always behave like this.