Hacker News
by kristianp 126 days ago
Not many people are getting useful inference out of a $500 Mac mini, since it only has 16GB of RAM.
1 comment

It depends. This particular model has larger experts with more active parameters, so 16GB is likely not enough (at least not without further tricks). But there are much sparser models where an active expert can sit in RAM while the weights for all the other experts stay on disk. This becomes more of a necessity as models get sparser and RAM gets tighter. It lowers performance, but the end result can still be "useful".
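To make the "active expert in RAM, everything else on disk" idea concrete, here is a toy sketch using only the Python standard library. It is not how any particular inference engine does it. The file layout, `ExpertStore`, `write_toy_experts`, `run_demo`, the expert sizes, and the routing sequence are all made up for illustration. Each expert is just a few floats in a memory-mapped file, and a small LRU cache decides which ones stay decoded in memory.

```python
import mmap
import os
import struct
import tempfile
from collections import OrderedDict

NUM_EXPERTS = 8                      # hypothetical: experts in one MoE layer
EXPERT_SIZE = 4                      # hypothetical: float32 weights per expert
BYTES_PER_EXPERT = EXPERT_SIZE * 4   # 4 bytes per float32


def write_toy_experts(path):
    """Write NUM_EXPERTS toy experts; every weight of expert i equals i."""
    with open(path, "wb") as f:
        for i in range(NUM_EXPERTS):
            f.write(struct.pack(f"<{EXPERT_SIZE}f", *([float(i)] * EXPERT_SIZE)))


class ExpertStore:
    """Keeps at most `capacity` experts decoded in RAM; the rest stay in the file."""

    def __init__(self, path, capacity):
        self._file = open(path, "rb")
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        self._cache = OrderedDict()  # expert id -> weights, oldest first
        self.capacity = capacity
        self.disk_reads = 0          # cache misses served from the mapped file

    def get(self, expert_id):
        if expert_id in self._cache:
            self._cache.move_to_end(expert_id)   # mark as most recently used
            return self._cache[expert_id]
        offset = expert_id * BYTES_PER_EXPERT
        weights = struct.unpack_from(f"<{EXPERT_SIZE}f", self._map, offset)
        self.disk_reads += 1
        self._cache[expert_id] = weights
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)      # evict least recently used
        return weights

    def close(self):
        self._map.close()
        self._file.close()


def run_demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "experts.bin")
        write_toy_experts(path)
        store = ExpertStore(path, capacity=2)    # RAM budget: 2 of 8 experts
        routed = [3, 3, 5, 3, 7, 5]              # hypothetical router picks, one per token
        outputs = [store.get(e)[0] for e in routed]
        reads, resident = store.disk_reads, len(store._cache)
        store.close()
        return outputs, reads, resident
```

With that routing sequence, `run_demo()` serves six lookups with four reads from the file and never holds more than two experts in memory. The repeated hits on expert 3 are where the cache pays off, and the misses are where the performance cost shows up. How well this works in practice depends on how often the router switches experts and how fast the disk is.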