Hacker News
by adastra22 379 days ago
There is nothing magical about GPU memory though. It’s just faster. But people have been doing CPU inference since the first llama code came out.
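
To put a rough number on "just faster": single-stream token generation is usually limited by memory bandwidth, because every generated token has to read roughly all of the model weights once. Here is a back-of-the-envelope sketch of that upper bound. The function name, the 4 GB model size, and both bandwidth figures are illustrative assumptions, not measurements of any particular hardware.

```python
# Rough upper bound on decode speed when generation is memory-bandwidth bound:
# each new token streams (approximately) all model weights from memory once.
# All numbers below are hypothetical round figures chosen for illustration.

GB = 1e9  # bytes


def max_tokens_per_second(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Bandwidth-bound ceiling: bytes available per second / bytes read per token."""
    if model_bytes <= 0 or bandwidth_bytes_per_s <= 0:
        raise ValueError("model size and bandwidth must be positive")
    return bandwidth_bytes_per_s / model_bytes


model_size = 4 * GB            # assumed: a small quantized model
cpu_bandwidth = 50 * GB        # assumed: system RAM bandwidth
gpu_bandwidth = 1000 * GB      # assumed: GPU memory bandwidth

cpu_rate = max_tokens_per_second(model_size, cpu_bandwidth)
gpu_rate = max_tokens_per_second(model_size, gpu_bandwidth)

print(f"CPU ceiling: {cpu_rate:.1f} tokens/s")
print(f"GPU ceiling: {gpu_rate:.1f} tokens/s")
```

Under these assumptions the same model runs in either kind of memory; only the ceiling on tokens per second changes. Real throughput will be lower than the ceiling, since this ignores compute, caches, and batching.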