by ac29 102 days ago
> 35b-A3B runs with around 14 tok/s and partial offloading

FYI, that is about what I am seeing for pure CPU inference, so something is likely off with your setup.

Test setup is an Intel 13500 w/ 6 threads and 64GB DDR4 RAM; a newer system should be much faster.