by pshc 121 days ago
With batched parallel requests, this scales down further. Even a MacBook M3 on battery power can do inference quickly and efficiently. Large-scale training is the power hog.