

by fxwin 122 days ago

I'm very skeptical of the advantage they're claiming here. The whitepaper [0] only compares these to full-precision models, when the more interesting (and probably more meaningful) comparison would be with other quantized models that have a similar memory footprint.

Especially since these models seem to be more or less just quantized variants of Qwen3 with custom kernels and other inference optimizations (?), rather than fine-tuned or trained from scratch with a new architecture, I am very surprised (or rather suspicious) that they didn't do the obvious comparison with a quantized Qwen3.

Their (to my knowledge) new measure/definition of intelligence seems reasonable, but introducing something like this without thorough benchmarking + model comparison is even more of a red flag to me.

[0] https://github.com/PrismML-Eng/Bonsai-demo/blob/main/1-bit-b...

1 comment

riedel 122 days ago

Actually, IMHO the promise would be beyond standard FP4 quants. I think the goal is closer to where 1.58-bit (ternary) quants are heading. Having said that, it would be interesting to see performance on nonstandard HW.
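
For context on the numbers: "1.58 bit" is just log2(3), the information content of one ternary weight in {-1, 0, +1}. A rough sketch of how weight storage scales with bits per weight is below. The 8e9 parameter count is a hypothetical value chosen for illustration, not a figure from the whitepaper, and the estimate ignores scales, zero points, and any layers kept in higher precision, so real quantized checkpoints will come out somewhat larger.

```python
import math


def weight_footprint_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB for a given bit width.

    Ignores quantization metadata (scales, zero points) and any
    tensors that a real format keeps in higher precision.
    """
    return n_params * bits_per_weight / 8 / 2**30


# One ternary weight carries log2(3) bits of information.
TERNARY_BITS = math.log2(3)

# Hypothetical model size, for illustration only.
N_PARAMS = 8e9

formats = {
    "FP16": 16.0,
    "FP4": 4.0,
    "ternary": TERNARY_BITS,
    "1-bit": 1.0,
}

footprints = {
    name: weight_footprint_gib(N_PARAMS, bits)
    for name, bits in formats.items()
}

for name, gib in footprints.items():
    print(f"{name:>8}: {gib:6.2f} GiB")
```

Under these assumptions, a comparison at equal memory footprint would pit a 1-bit model against a quantized baseline with correspondingly fewer parameters, not against the same-size FP16 model.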
