Apple's unified memory should allow running large models like 65B that will not fit on a consumer GPU, but mostly I see people talking about the smaller 7B sizes that can run anywhere.
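For a rough sense of why 65B is the interesting case, here is a back-of-the-envelope sketch in Python. It counts the weights only, at parameters times bits per weight; the function name and the 16-bit and 4-bit choices are my own assumptions for illustration, and it ignores the KV cache, activations, and runtime overhead, so real usage will be higher.

```python
def weight_memory_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes).

    Hypothetical helper for illustration: ignores KV cache, activations,
    and any framework overhead.
    """
    # billions of params * bits per param / 8 bits per byte = billions of bytes
    return n_params_billion * bits_per_weight / 8


if __name__ == "__main__":
    for size in (7, 65):
        for bits in (16, 4):
            gb = weight_memory_gb(size, bits)
            print(f"{size}B at {bits}-bit: ~{gb:.1f} GB for weights")
```

By this estimate a 65B model needs about 130 GB at 16-bit and about 32.5 GB at 4-bit, so even quantized it is over the 24 GB of VRAM on an RTX 4090, while a 7B model at 4-bit is only about 3.5 GB.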
They are leveraging Apple's Metal Performance Shaders[1], not the Neural Engine. From the chart, it looks like you might get at most a ~20x boost on inference over plain CPU. Obviously, it's not like having an RTX 4090, but it's better than nothing.
So the question is: how much RAM do you need? You and another person mentioned 96GB, while the person below says he can run it with 24GB. What's the right amount of RAM to get for now? Of course 128GB (or whatever the max is) would be best, but what's a great amount to have today? I never bought an M1, and I'm thinking of buying one now ;-)