Hacker News
by diggan 606 days ago
> that support fast and lossless inference of 1.58-bit models on CPU (with NPU and GPU support coming next).