by visarga 1250 days ago
> I assume once a model is loaded, many queries can be serviced by that model quickly.

Depends. If you have room to load the whole model, yes. If you need to swap parts of the model in and out, then it matters whether you have enough RAM.
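
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the weights dominate the footprint (it ignores activations and caches), and it assumes a three-way split between fast memory, system RAM, and disk. The function names, the 2-bytes-per-parameter default, and the memory sizes are all hypothetical, picked for illustration only.

```python
GIB = 1024 ** 3

def model_bytes(n_params: int, bytes_per_param: int = 2) -> int:
    """Approximate size of the weights alone (assumption: 2 bytes per parameter)."""
    return n_params * bytes_per_param

def placement(n_params: int, fast_mem_bytes: int, ram_bytes: int,
              bytes_per_param: int = 2) -> str:
    """Classify where the weights would have to live (hypothetical helper)."""
    need = model_bytes(n_params, bytes_per_param)
    if need <= fast_mem_bytes:
        return "resident"        # whole model loaded once, queries are quick
    if need <= ram_bytes:
        return "swap from RAM"   # parts get swapped in and out, RAM holds the rest
    return "swap from disk"      # not enough RAM either, so swapping hits disk

# Made-up machine: 24 GiB of fast memory, 64 GiB of system RAM.
for n in (7_000_000_000, 30_000_000_000, 70_000_000_000):
    print(n, placement(n, 24 * GIB, 64 * GIB))
```

The first case is the "yes" above. The other two are where swapping kicks in, and the amount of RAM decides how painful it gets.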