by ZaneHam
191 days ago
Author here. I've been collecting historical computing documentation for a few years and came across Brusentsov's balanced ternary research at Moscow State University (1958-1965), so I applied it to modern transformers. Some interesting results:

- 93.8% energy reduction per inference
- 16x memory compression (7B model: 28GB → 1.75GB)
- Zero floating-point multiplication
- Runs on CPUs, no GPU required
- Architectural epistemic uncertainty (it won't hallucinate what it doesn't know)

Repo: https://github.com/Zaneham/Ternary_inference

Happy to answer questions :-) Happy holidays and merry Christmas!
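
To illustrate the "zero floating-point multiplication" and 16x compression points, here is a minimal Python sketch. It is not taken from the repo: the function names and the 2-bit encoding are assumptions made for illustration. With weights restricted to -1, 0, and +1, a dot product needs only additions and subtractions, and storing each weight in 2 bits instead of a 32-bit float gives the 16x ratio (7B weights x 2 bits = 1.75GB).

```python
# Illustrative sketch only; names and encoding are hypothetical, not the repo's code.

def ternary_dot(weights, activations):
    """Dot product with weights in {-1, 0, +1}: add, subtract, or skip."""
    total = 0.0
    for w, x in zip(weights, activations):
        if w == 1:
            total += x
        elif w == -1:
            total -= x
        # w == 0 contributes nothing, so it is skipped entirely
    return total


def pack_trits(weights):
    """Pack ternary weights at 2 bits each, four weights per byte."""
    codes = {0: 0b00, 1: 0b01, -1: 0b10}  # assumed encoding; 0b11 is unused
    packed = bytearray((len(weights) + 3) // 4)
    for i, w in enumerate(weights):
        packed[i // 4] |= codes[w] << (2 * (i % 4))
    return bytes(packed)


def unpack_trits(packed, count):
    """Recover the first `count` ternary weights from packed bytes."""
    values = (0, 1, -1)  # indexed by the 2-bit code
    return [values[(packed[i // 4] >> (2 * (i % 4))) & 0b11] for i in range(count)]


weights = [1, -1, 0, 1, -1]
activations = [0.5, 2.0, 3.0, 1.5, 1.0]
print(ternary_dot(weights, activations))   # 0.5 - 2.0 + 1.5 - 1.0 = -1.0
print(len(pack_trits(weights)))            # 5 weights fit in 2 bytes
```

A real implementation would presumably vectorize this and apply a per-layer scale factor; the sketch shows only the arithmetic idea.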