by palmer_fox 1024 days ago
Perhaps the wrong thread to ask this question... Is it not possible to load a model on something like an NVMe M.2 drive instead of RAM? It's slower of course, but only 5-10x if I understand correctly.

Yes, but LLMs are already slow enough on normal hardware for that 5-10x slowdown to be painful…
Can you RAID them?
Technically yes?

But it's way beyond the point where it's going to help LLMs. CPU RAM is already "too slow" in machines big enough for multiple NVMe SSDs.
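
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes a dense model where every weight has to be read once per generated token, so decode speed is capped at bandwidth divided by model size. The model size and the bandwidth figures are round-number assumptions for illustration, not measurements, and the names (`tokens_per_second`, `estimate_all`) are made up for this example.

```python
def tokens_per_second(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on decode speed if every weight is read once per token."""
    return bandwidth_bytes_per_s / model_bytes


GB = 1e9

# Assumption: roughly 70B parameters at about 4.5 bits each, so about 40 GB.
MODEL_BYTES = 40 * GB

# Assumption: round-number sequential read bandwidths; real hardware varies.
BANDWIDTH = {
    "single NVMe SSD": 7 * GB,
    "4x NVMe RAID 0 (ideal scaling)": 28 * GB,
    "dual-channel DDR5 RAM": 80 * GB,
}


def estimate_all(model_bytes: float = MODEL_BYTES, bandwidth: dict = BANDWIDTH) -> dict:
    """Return the tokens/s ceiling for each storage tier."""
    return {name: tokens_per_second(model_bytes, bw) for name, bw in bandwidth.items()}


if __name__ == "__main__":
    for name, tps in estimate_all().items():
        print(f"{name:32s} {tps:6.2f} tokens/s (upper bound)")
```

Under those assumptions, even a perfectly scaling four-drive array stays under one token per second, and plain desktop RAM only manages about two, which is why striping SSDs doesn't change the picture much.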