|
|
|
|
|
by cubefox
95 days ago
|
|
Yeah, "1.58 bit" is 1 trit with three states, since log2(3)≈1.58. So it's not a inference framework for 1-bit models (two states per parameter) but for 1.58 bit models (three states per parameter). Annoying that they try to mix up the two. |
|