by gavmor 726 days ago
What method is that? Layer offloading?
1 comment

Yes, it's either that or CPU inference. The article doesn't say which.

It doesn't mention quantization either.